RPC-Bench: A Fine-grained Benchmark for Research Paper Comprehension

Yelin Chen; Fanjin Zhang; Suping Sun; Yunhe Pang; Yuanchun Wang; Jian Song; XiaoYan Li; Lei Hou; Shu Zhao; Jie Tang; Juanzi Li

Abstract

Understanding research papers remains challenging for foundation models due to specialized scientific discourse and complex figures and tables, yet existing benchmarks offer limited fine-grained evaluation at scale. To address this gap, we introduce RPC-Bench, a large-scale question-answering benchmark built from review–rebuttal exchanges of high-quality computer science papers, containing 15K human-verified QA pairs. We design a fine-grained taxonomy aligned with the scientific research flow to assess models’ ability to understand and answer why, what, and how questions in scholarly contexts. We also define an elaborate LLM–human interaction annotation framework to support large-scale labeling and quality control. Following the LLM-as-a-Judge paradigm, we develop a scalable framework that evaluates models on correctness-completeness and conciseness and agrees closely with human judgment. Experiments reveal that even the strongest model (GPT-5) achieves only 68.2% correctness-completeness, dropping to 37.46% after conciseness adjustment, highlighting substantial gaps in precise academic paper understanding.

Anthology ID: 2026.acl-long.1277
Volume: Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 27683–27717
URL: https://preview.aclanthology.org/ingest-acl/2026.acl-long.1277/
Cite (ACL): Yelin Chen, Fanjin Zhang, Suping Sun, Yunhe Pang, Yuanchun Wang, Jian Song, XiaoYan Li, Lei Hou, Shu Zhao, Jie Tang, and Juanzi Li. 2026. RPC-Bench: A Fine-grained Benchmark for Research Paper Comprehension. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 27683–27717, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): RPC-Bench: A Fine-grained Benchmark for Research Paper Comprehension (Chen et al., ACL 2026)
PDF: https://preview.aclanthology.org/ingest-acl/2026.acl-long.1277.pdf
Checklist: 2026.acl-long.1277.checklist.pdf
