Biography

Nan Zhang (Chinese: 张楠) is a Ph.D. student in the College of Information Sciences and Technology at The Pennsylvania State University. He has broad interests in natural language processing (NLP), machine learning, and efficient AI. He is advised by Dr. Rui Zhang and Dr. Prasenjit Mitra.

Previously, he interned at Salesforce AI Research and NEC Labs America. Before joining Penn State, he received his bachelor’s degree from Worcester Polytechnic Institute (WPI) and his master’s degree from the Georgia Institute of Technology.

He works on generalizable and efficient approaches for both learning algorithms and ML systems. Specifically, he aims to unlock the full potential of LLMs.

  • LLM Compression: Domain-specific pruning (NAACL 24 Findings), mechanistic interpretation of compressed LLMs (ICLR 26), and sub-4-bit quantization (preprint).
  • RAG: Indexing similar and related information for multihop reasoning (ICLR 25).
  • Post-Training: Faithful medical summarization via SFT (EMNLP 23).

Recent News

All news »

[Feb. 2026] Our quantization paper is online, entitled QuantLRM: Quantization of Large Reasoning Models via Fine-Tuning Signals. Our method delivers consistent improvements for LRM quantization, with an average gain of 6.55% on an RL fine-tuned model!

[Jan. 2026] Excited that our LRM compression benchmarking and interpretation paper has been accepted to ICLR 2026!

[Oct. 2025] Our benchmarking and interpretation paper on compressed large reasoning models (LRMs) is online, entitled When Reasoning Meets Compression: Understanding the Effects of LLMs Compression on Large Reasoning Models. We analyze quantized, distilled, and pruned LRMs to decode the effects of compression!

[Sept. 2025] Our paper on creating training data for Process Reward Models (PRMs) is online, entitled Generalizable Process Reward Models via Formally Verified Training Data. Feel free to check it out!

[June 2025] I am honored to receive the 2024-25 Vice Provost and Dean of the Graduate School Student Persistence Scholarship!

Selected Publications

All publications »

(2026). QuantLRM: Quantization of Large Reasoning Models via Fine-Tuning Signals. Preprint, 2026.

PDF Code

(2026). When Reasoning Meets Compression: Understanding the Effects of LLMs Compression on Large Reasoning Models. ICLR, 2026.

PDF Code

(2025). SiReRAG: Indexing Similar and Related Information for Multihop Reasoning. ICLR, 2025.

PDF Code

(2024). Pruning as a Domain-specific LLM Extractor. NAACL Findings, 2024.

PDF Code

(2023). FaMeSumm: Investigating and Improving Faithfulness of Medical Summarization. EMNLP, 2023.

PDF Code