Nan Zhang

Ph.D. Student

The Pennsylvania State University

Biography

Nan Zhang (Chinese: 张楠) is a Ph.D. student in the College of Information Sciences and Technology at The Pennsylvania State University. He has broad interests in natural language processing (NLP), clinical NLP, and machine learning. He is advised by Dr. Rui Zhang and Dr. Prasenjit Mitra. He is currently working on LLM compression (e.g., pruning and quantization) and retrieval-augmented generation (RAG).

Before joining Penn State, he received his bachelor’s degree from Worcester Polytechnic Institute (WPI) in 2017 and his master’s degree from Georgia Institute of Technology in 2020.

Interests

LLM Compression
RAG
Natural Language Processing
Machine Learning

Education

PhD in Informatics, 2020 - Present
The Pennsylvania State University

MS in Computational Science and Engineering, 2020
Georgia Institute of Technology

BS in Computer Science & Industrial Engineering (double major), 2017
Worcester Polytechnic Institute

Recent News

[Apr. 2025] Excited that SiReRAG has been accepted to ICLR 2025! My collaborators are presenting it in person during Poster Session 1 (#61 at Hall 3 + Hall 2B). I am happy to discuss research on RAG, LLM compression, and large reasoning models virtually.

[Apr. 2025] Our benchmarking paper on compressed large reasoning models (LRMs) is online, entitled When Reasoning Meets Compression: Benchmarking Compressed Large Reasoning Models on Complex Reasoning Tasks. We provide a detailed analysis of quantized, distilled, and pruned reasoning models!

[Dec. 2024] Our RAG indexing paper on similar and related corpus contents is online, entitled SiReRAG: Indexing Similar and Related Information for Multihop Reasoning. SiReRAG consistently outperforms existing indexing methods on multihop datasets!

[Sept. 2024] One paper on LLMs as paper reviewers and area chairs has been accepted to EMNLP 2024.

[Aug. 2024] One paper on self-correction of LLMs has been accepted to TACL 2024.

Publications

Nan Zhang, Yusen Zhang, Prasenjit Mitra, Rui Zhang (2025). When Reasoning Meets Compression: Benchmarking Compressed Large Reasoning Models on Complex Reasoning Tasks. Preprint, 2025.

PDF

Nan Zhang, Prafulla Kumar Choubey, Alexander Fabbri, Gabriel Bernadett-Shapiro, Rui Zhang, Prasenjit Mitra, Caiming Xiong, Chien-Sheng Wu (2025). SiReRAG: Indexing Similar and Related Information for Multihop Reasoning. ICLR, 2025.

PDF Code

Jiangshu Du, Yibo Wang, Wenting Zhao, Zhongfen Deng, Shuaiqi Liu, Renze Lou, Henry Peng Zou, Pranav Narayanan Venkit, Nan Zhang, Mukund Srinath, Haoran Ranran Zhang, Vipul Gupta, Yinghui Li, Tao Li, Fei Wang, Qin Liu, Tianlin Liu, Pengzhi Gao, Congying Xia, Chen Xing, Jiayang Cheng, Zhaowei Wang, Ying Su, Raj Sanjay Shah, Ruohao Guo, Jing Gu, Haoran Li, Kangda Wei, Zihao Wang, Lu Cheng, Surangika Ranathunga, Meng Fang, Jie Fu, Fei Liu, Ruihong Huang, Eduardo Blanco, Yixin Cao, Rui Zhang, Philip S Yu, Wenpeng Yin (2024). LLMs assist NLP Researchers: Critique Paper (Meta-) Reviewing. EMNLP, 2024.

PDF

Ryo Kamoi, Yusen Zhang, Nan Zhang, Jiawei Han, Rui Zhang (2024). When Can LLMs Actually Correct Their Own Mistakes? A Critical Survey of Self-Correction of LLMs. TACL, 2024.

PDF

Ryo Kamoi, Sarkar Snigdha Sarathi Das, Renze Lou, Jihyun Janice Ahn, Yilun Zhao, Xiaoxin Lu, Nan Zhang, Yusen Zhang, Ranran Haoran Zhang, Sujeeth Reddy Vummanthala, Salika Dave, Shaobo Qin, Arman Cohan, Wenpeng Yin, Rui Zhang (2024). Evaluating LLMs at Detecting Errors in LLM Responses. COLM, 2024.

PDF Code Dataset

Nan Zhang, Yanchi Liu, Xujiang Zhao, Wei Cheng, Runxue Bao, Rui Zhang, Prasenjit Mitra, Haifeng Chen (2024). Pruning as a Domain-specific LLM Extractor. NAACL Findings, 2024.

PDF Code

Yusen Zhang, Nan Zhang, Yixin Liu, Alexander Fabbri, Junru Liu, Ryo Kamoi, Xiaoxin Lu, Caiming Xiong, Jieyu Zhao, Dragomir Radev, Kathleen McKeown, Rui Zhang (2024). Fair Abstractive Summarization of Diverse Perspectives. NAACL, 2024.

PDF Code

Nan Zhang, Connor Heaton, Sean Timothy Okonsky, Prasenjit Mitra, Hilal Ezgi Toraman (2024). PEaCE: A Chemistry-Oriented Dataset for Optical Character Recognition on Scientific Documents. LREC-COLING, 2024.

PDF Code

Guangji Bai, Zheng Chai, Chen Ling, Shiyu Wang, Jiaying Lu, Nan Zhang, Tingwei Shi, Ziyang Yu, Mengdan Zhu, Yifei Zhang, Carl Yang, Yue Cheng, Liang Zhao (2024). Beyond Efficiency: A Systematic Survey of Resource-Efficient Large Language Models. Preprint, 2024.

PDF

Nan Zhang, Yusen Zhang, Wu Guo, Prasenjit Mitra, Rui Zhang (2023). FaMeSumm: Investigating and Improving Faithfulness of Medical Summarization. EMNLP, 2023.

PDF Code

Nan Zhang, Shomir Wilson, Prasenjit Mitra (2022). STAPI: An Automatic Scraper for Extracting Iterative Title-Text Structure from Web Documents. LREC, 2022.

PDF Code