@inproceedings{zheng-etal-2025-long-form,
    title = "Long-Form Information Alignment Evaluation Beyond Atomic Facts",
    author = "Zheng, Danna and
      Lapata, Mirella and
      Pan, Jeff Z.",
    editor = "Christodoulopoulos, Christos and
      Chakraborty, Tanmoy and
      Rose, Carolyn and
      Peng, Violet",
    booktitle = "Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2025",
    address = "Suzhou, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.emnlp-main.558/",
    doi = "10.18653/v1/2025.emnlp-main.558",
    pages = "11018--11038",
    isbn = "979-8-89176-332-6",
    abstract = "Information alignment evaluators are vital for various NLG evaluation tasks and trustworthy LLM deployment, reducing hallucinations and enhancing user trust. Current fine-grained methods, like FactScore, verify facts individually but neglect inter-fact dependencies, enabling subtle vulnerabilities. In this work, we introduce MontageLie, a challenging benchmark that constructs deceptive narratives by ``montaging'' truthful statements without introducing explicit hallucinations. We demonstrate that both coarse-grained LLM-based evaluators and current fine-grained frameworks are susceptible to this attack, with AUC-ROC scores falling below 65{\%}. To enable more robust fine-grained evaluation, we propose DoveScore, a novel framework that jointly verifies factual accuracy and event-order consistency. By modeling inter-fact relationships, DoveScore outperforms existing fine-grained methods by over 8{\%}, providing a more robust solution for long-form text alignment evaluation. Our code and datasets are available at https://github.com/dannalily/DoveScore."
}
@comment{Informal Markdown citation (from the ACL Anthology page):
[Long-Form Information Alignment Evaluation Beyond Atomic Facts](https://aclanthology.org/2025.emnlp-main.558/) (Zheng et al., EMNLP 2025)
ACL}