SciDQA: A Deep Reading Comprehension Dataset over Scientific Papers

Shruti Singh, Nandan Sarkar, Arman Cohan


Abstract
Scientific literature is typically dense, requiring significant background knowledge and deep comprehension for effective engagement. We introduce SciDQA, a new dataset for reading comprehension that challenges language models to deeply understand scientific articles, consisting of 2,937 QA pairs. Unlike other scientific QA datasets, SciDQA sources questions from peer reviews by domain experts and answers by paper authors, ensuring a thorough examination of the literature. We enhance the dataset’s quality through a process that carefully decontextualizes the content, tracks the source document across different versions, and incorporates a bibliography for multi-document question-answering. Questions in SciDQA necessitate reasoning across figures, tables, equations, appendices, and supplementary materials, and require multi-document reasoning. We evaluate several open-source and proprietary LLMs across various configurations to explore their capabilities in generating relevant and factual responses, as opposed to simple review memorization. Our comprehensive evaluation, based on metrics for surface-level and semantic similarity, highlights notable performance discrepancies. SciDQA represents a rigorously curated, naturally derived scientific QA dataset, designed to facilitate research on complex reasoning within the domain of question answering for scientific texts.
Anthology ID:
2024.emnlp-main.1163
Original:
2024.emnlp-main.1163v1
Version 2:
2024.emnlp-main.1163v2
Volume:
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2024
Address:
Miami, Florida, USA
Editors:
Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
20908–20923
URL:
https://preview.aclanthology.org/add-emnlp-2024-awards/2024.emnlp-main.1163/
DOI:
10.18653/v1/2024.emnlp-main.1163
Cite (ACL):
Shruti Singh, Nandan Sarkar, and Arman Cohan. 2024. SciDQA: A Deep Reading Comprehension Dataset over Scientific Papers. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 20908–20923, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal):
SciDQA: A Deep Reading Comprehension Dataset over Scientific Papers (Singh et al., EMNLP 2024)
PDF:
https://preview.aclanthology.org/add-emnlp-2024-awards/2024.emnlp-main.1163.pdf