Evaluating the Impact of Source Diversity for RAG in Historical Research
Ruhi Mahadeshwar, Andreas van Cranenburgh, Tommaso Caselli, Malvina Nissim
Abstract
Historical research increasingly benefits from large language models (LLMs). However, LLMs are prone to factual inaccuracy, unreliability, and biased interpretations of data. Retrieval-augmented generation (RAG) approaches have emerged as solutions, but may inadvertently perpetuate biased perspectives embedded in historical archives. This paper investigates how source diversity in RAG impacts perspective variation in historical question answering. We compile a multilingual corpus (English, French, Dutch) of historical documents spanning multiple countries and focus on Napoleon Bonaparte. We evaluate three Qwen3 models across ten questions using a multi-layered framework combining traditional metrics (BERTScore, ROUGE-L), frame semantics analysis, and syntactic profiling. Our results highlight that, while traditional similarity metrics suggest high semantic consistency, frame-semantic analysis exposes substantial perspective shifts. Baseline answers present "flattened" cross-lingual perspectives, whereas RAG introduces diversity. Critically, this diversity manifests differently across languages, demonstrating language-specific patterns. Our findings highlight limitations of traditional evaluation metrics for perspective-sensitive tasks and demonstrate that RAG constitutes active perspective transformation rather than neutral augmentation.
- Anthology ID:
- 2026.lrec-main.53
- Volume:
- Proceedings of the Fifteenth Language Resources and Evaluation Conference
- Month:
- May
- Year:
- 2026
- Address:
- Palma de Mallorca, Spain
- Editors:
- Stelios Piperidis, Núria Bel, Henk van den Heuvel, Nancy Ide, Simon Krek, Antonio Toral
- Venue:
- LREC
- Publisher:
- ELRA Language Resource Association
- Pages:
- 716–734
- URL:
- https://preview.aclanthology.org/ingest-lrec/2026.lrec-main.53/
- Cite (ACL):
- Ruhi Mahadeshwar, Andreas van Cranenburgh, Tommaso Caselli, and Malvina Nissim. 2026. Evaluating the Impact of Source Diversity for RAG in Historical Research. International Conference on Language Resources and Evaluation, main:716–734.
- Cite (Informal):
- Evaluating the Impact of Source Diversity for RAG in Historical Research (Mahadeshwar et al., LREC 2026)
- PDF:
- https://preview.aclanthology.org/ingest-lrec/2026.lrec-main.53.pdf