Biography

Nan Zhang (Chinese: 张楠) is a Ph.D. student in the College of Information Sciences and Technology at The Pennsylvania State University. He has broad interests in natural language processing, clinical NLP, and machine learning. He is advised by Dr. Rui Zhang and Dr. Prasenjit Mitra. He is currently working on LLM compression and summarization.

Before joining Penn State, he received his bachelor’s degree from Worcester Polytechnic Institute (WPI) in 2017 and his master’s degree from Georgia Institute of Technology in 2020.

Interests
  • LLM Compression
  • Natural Language Processing
  • Clinical NLP
  • Machine Learning
Education
  • PhD in Informatics, 2020 - Present

    The Pennsylvania State University

  • MS in Computational Science and Engineering, 2020

    Georgia Institute of Technology

  • BS in Computer Science & Industrial Engineering (double major), 2017

    Worcester Polytechnic Institute

Recent News

[Sept. 2024] One paper on LLMs as paper reviewers and area chairs has been accepted to EMNLP 2024.

[Aug. 2024] One paper on self-correction of LLMs has been accepted to TACL 2024.

[July 2024] One paper on an error detection benchmark for LLMs has been accepted to COLM 2024.

[June 2024] Our survey paper on self-correction of LLMs is online, entitled When Can LLMs Actually Correct Their Own Mistakes? A Critical Survey of Self-Correction of LLMs. Feel free to check it out!

[Apr. 2024] I will join Salesforce AI Research as a Research Intern in Palo Alto, CA, in summer 2024!

Publications

(2024). LLMs Assist NLP Researchers: Critique Paper (Meta-)Reviewing. EMNLP, 2024.

PDF

(2024). When Can LLMs Actually Correct Their Own Mistakes? A Critical Survey of Self-Correction of LLMs. TACL, 2024.

PDF

(2024). Evaluating LLMs at Detecting Errors in LLM Responses. COLM, 2024.

PDF Code Dataset

(2024). Pruning as a Domain-specific LLM Extractor. NAACL Findings, 2024.

PDF Code

(2024). Fair Abstractive Summarization of Diverse Perspectives. NAACL, 2024.

PDF Code

(2024). PEaCE: A Chemistry-Oriented Dataset for Optical Character Recognition on Scientific Documents. LREC-COLING, 2024.

PDF Code

(2024). Beyond Efficiency: A Systematic Survey of Resource-Efficient Large Language Models. Preprint, 2024.

PDF

(2023). FaMeSumm: Investigating and Improving Faithfulness of Medical Summarization. EMNLP, 2023.

PDF Code