Towards Multi-Document Question Answering in Scientific Literature: Pipeline, Dataset, and Evaluation

Hui Huang; Julien Velcin; Yacine Kessaci

doi:10.18653/v1/2025.findings-emnlp.576

Abstract

Question-Answering (QA) systems are vital for rapidly accessing and comprehending information in academic literature. However, some academic questions require synthesizing information across multiple documents. While several prior resources consider multi-document QA, they often do not strictly enforce cross-document synthesis or exploit the explicit inter-paper structure that links sources. To address this, we introduce a pipeline methodology for constructing a Multi-Document Academic QA (MDA-QA) dataset. By detecting communities in citation networks and leveraging Large Language Models (LLMs), we form thematically coherent communities and automatically generate QA pairs grounded in multi-document content. We further develop an automated filtering mechanism to ensure multi-document dependence. Our resulting dataset consists of 6,804 QA pairs and serves as a benchmark for evaluating multi-document retrieval and QA systems. Our experimental results show that standard lexical and embedding-based retrieval methods struggle to locate all relevant documents, indicating a persistent gap in multi-document reasoning. We release our dataset and source code for the community.
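
The abstract's finding that retrievers "struggle to locate all relevant documents" can be made concrete with a small scoring sketch. The code below is illustrative only: the function names, document IDs, and the choice of a strict "all gold documents in the top k" criterion are assumptions for this example, not the metrics or code released with the paper.

```python
from typing import Iterable, Sequence


def partial_recall_at_k(retrieved: Sequence[str], gold: Iterable[str], k: int) -> float:
    """Fraction of gold documents that appear among the top-k retrieved IDs."""
    gold_set = set(gold)
    if not gold_set:
        raise ValueError("gold must contain at least one document ID")
    return len(gold_set & set(retrieved[:k])) / len(gold_set)


def full_recall_at_k(retrieved: Sequence[str], gold: Iterable[str], k: int) -> bool:
    """True only if every gold document appears among the top-k retrieved IDs.

    This strict variant (an assumption of this sketch) reflects the idea that a
    multi-document question is only answerable once all its sources are found.
    """
    gold_set = set(gold)
    if not gold_set:
        raise ValueError("gold must contain at least one document ID")
    return gold_set <= set(retrieved[:k])


# Hypothetical ranking for one question whose answer depends on two papers.
ranked = ["p3", "p7", "p1", "p9", "p2"]
gold = {"p1", "p2"}

print(partial_recall_at_k(ranked, gold, k=3))  # one of two gold papers is in the top 3
print(full_recall_at_k(ranked, gold, k=3))     # strict criterion fails at k=3
print(full_recall_at_k(ranked, gold, k=5))     # both gold papers are in the top 5
```

Under this strict criterion a retriever gets no credit for finding only some of the required papers, which is one plausible way to expose the gap the abstract describes.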

Anthology ID: 2025.findings-emnlp.576
Volume: Findings of the Association for Computational Linguistics: EMNLP 2025
Month: November
Year: 2025
Address: Suzhou, China
Editors: Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 10867–10881
URL: https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.576/
DOI: 10.18653/v1/2025.findings-emnlp.576
Cite (ACL): Hui Huang, Julien Velcin, and Yacine Kessaci. 2025. Towards Multi-Document Question Answering in Scientific Literature: Pipeline, Dataset, and Evaluation. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 10867–10881, Suzhou, China. Association for Computational Linguistics.
Cite (Informal): Towards Multi-Document Question Answering in Scientific Literature: Pipeline, Dataset, and Evaluation (Huang et al., Findings 2025)
PDF: https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.576.pdf
Checklist: 2025.findings-emnlp.576.checklist.pdf
