Biography

Nan Zhang (Chinese: 张楠) is a Ph.D. student in the College of Information Sciences and Technology at The Pennsylvania State University, advised by Dr. Rui Zhang and Dr. Prasenjit Mitra. He has broad interests in natural language processing (NLP), clinical NLP, and machine learning. He is currently working on LLM compression (e.g., pruning and quantization) and retrieval-augmented generation (RAG).

Before joining Penn State, he received his bachelor’s degree from Worcester Polytechnic Institute (WPI) in 2017 and his master’s degree from Georgia Institute of Technology in 2020.

Interests
  • LLM Compression
  • RAG
  • Natural Language Processing
  • Machine Learning

Education
  • PhD in Informatics, 2020 - Present

    The Pennsylvania State University

  • MS in Computational Science and Engineering, 2020

    Georgia Institute of Technology

  • BS in Computer Science & Industrial Engineering (double major), 2017

    Worcester Polytechnic Institute

Recent News

[Apr. 2025] Excited that SiReRAG has been accepted to ICLR 2025! My collaborators are presenting it in person during Poster Session 1 (#61 in Hall 3 + Hall 2B). I am happy to discuss research on RAG, LLM compression, and large reasoning models virtually.

[Apr. 2025] Our benchmarking paper on compressed large reasoning models (LRMs), entitled When Reasoning Meets Compression: Benchmarking Compressed Large Reasoning Models on Complex Reasoning Tasks, is online. We provide a detailed analysis of quantized, distilled, and pruned reasoning models!

[Dec. 2024] Our RAG indexing paper on similar and related corpus contents, entitled SiReRAG: Indexing Similar and Related Information for Multihop Reasoning, is online. Our method consistently outperforms existing indexing methods on multihop datasets!

[Sept. 2024] One paper on LLMs as paper reviewers and area chairs has been accepted to EMNLP 2024.

[Aug. 2024] One paper on self-correction of LLMs has been accepted to TACL 2024.

Publications

(2025). When Reasoning Meets Compression: Benchmarking Compressed Large Reasoning Models on Complex Reasoning Tasks. Preprint, 2025.

PDF

(2025). SiReRAG: Indexing Similar and Related Information for Multihop Reasoning. ICLR, 2025.

PDF Code

(2024). LLMs Assist NLP Researchers: Critique Paper (Meta-)Reviewing. EMNLP, 2024.

PDF

(2024). When Can LLMs Actually Correct Their Own Mistakes? A Critical Survey of Self-Correction of LLMs. TACL, 2024.

PDF

(2024). Evaluating LLMs at Detecting Errors in LLM Responses. COLM, 2024.

PDF Code Dataset

(2024). Pruning as a Domain-specific LLM Extractor. NAACL Findings, 2024.

PDF Code

(2024). Fair Abstractive Summarization of Diverse Perspectives. NAACL, 2024.

PDF Code

(2024). PEaCE: A Chemistry-Oriented Dataset for Optical Character Recognition on Scientific Documents. LREC-COLING, 2024.

PDF Code

(2024). Beyond Efficiency: A Systematic Survey of Resource-Efficient Large Language Models. Preprint, 2024.

PDF

(2023). FaMeSumm: Investigating and Improving Faithfulness of Medical Summarization. EMNLP, 2023.

PDF Code