When Claims Evolve: Evaluating and Enhancing the Robustness of Embedding Models Against Misinformation Edits

Jabez Magomere, Emanuele La Malfa, Manuel Tonneau, Ashkan Kazemi, Scott A. Hale


Abstract
Online misinformation remains a critical challenge, and fact-checkers increasingly rely on claim matching systems that use sentence embedding models to retrieve relevant fact-checks. However, as users interact with claims online, they often introduce edits, and it remains unclear whether current embedding models used in retrieval are robust to such edits. To investigate this, we introduce a perturbation framework that generates valid and natural claim variations, enabling us to assess the robustness of a wide range of sentence embedding models in a multi-stage retrieval pipeline and evaluate the effectiveness of various mitigation approaches. Our evaluation reveals that standard embedding models exhibit notable performance drops on edited claims, while LLM-distilled embedding models offer improved robustness at a higher computational cost. Although a strong reranker helps to reduce the performance drop, it cannot fully compensate for first-stage retrieval gaps. To address these retrieval gaps, we evaluate train- and inference-time mitigation approaches, demonstrating that they can improve in-domain robustness by up to 17 percentage points and boost out-of-domain generalization by 10 percentage points. Overall, our findings provide practical improvements to claim-matching systems, enabling more reliable fact-checking of evolving misinformation.
Anthology ID:
2025.findings-acl.1150
Volume:
Findings of the Association for Computational Linguistics: ACL 2025
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
22374–22404
URL:
https://preview.aclanthology.org/mtsummit-25-ingestion/2025.findings-acl.1150/
DOI:
10.18653/v1/2025.findings-acl.1150
Cite (ACL):
Jabez Magomere, Emanuele La Malfa, Manuel Tonneau, Ashkan Kazemi, and Scott A. Hale. 2025. When Claims Evolve: Evaluating and Enhancing the Robustness of Embedding Models Against Misinformation Edits. In Findings of the Association for Computational Linguistics: ACL 2025, pages 22374–22404, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
When Claims Evolve: Evaluating and Enhancing the Robustness of Embedding Models Against Misinformation Edits (Magomere et al., Findings 2025)
PDF:
https://preview.aclanthology.org/mtsummit-25-ingestion/2025.findings-acl.1150.pdf