Less Can be More: An Empirical Evaluation of Small and Large Language Models for Sentence-level Claim Detection

Andrew Bell


Abstract
Sentence-level claim detection is a critical first step in the fact-checking process. While Large Language Models (LLMs) seem well-suited for claim detection, their computational cost poses challenges for real-world deployment. This paper investigates the effectiveness of both small and large pretrained language models for the task of claim detection. We conduct a comprehensive empirical evaluation using BERT, ModernBERT, RoBERTa, Llama, and ChatGPT-based models. Our results reveal that smaller models, when fine-tuned appropriately, can achieve competitive in-domain performance with significantly lower computational overhead. Notably, we also find that BERT-based models transfer poorly to out-of-domain sentence-level claim detection tasks. We discuss the implications of these findings for practitioners and highlight directions for future research.
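
To make the small-model setting concrete, below is a minimal sketch of the kind of fine-tuning the abstract refers to: training a BERT-base encoder as a binary sentence-level claim detector with the Hugging Face Transformers Trainer API. The toy sentences, label scheme, and hyperparameters are illustrative assumptions, not the paper's actual data or configuration.

# Minimal sketch (illustrative assumptions, not the paper's exact setup):
# fine-tuning a small pretrained encoder for binary claim detection.
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Toy training data; label 1 = check-worthy factual claim, 0 = not a claim.
train = Dataset.from_dict({
    "text": ["The unemployment rate fell to 3.5 percent last year.",
             "What a wonderful day it is!"],
    "label": [1, 0],
})

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)

def tokenize(batch):
    # Pad/truncate each sentence to a fixed length for batching.
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=128)

train = train.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="claim-detector",   # checkpoint directory (arbitrary name)
    num_train_epochs=3,            # assumed values, not the paper's
    per_device_train_batch_size=16,
    learning_rate=2e-5,
)
Trainer(model=model, args=args, train_dataset=train).train()

At inference time, a fine-tuned encoder like this scores each sentence independently, which is what keeps computational overhead low relative to prompting an LLM per sentence.
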
Anthology ID: 2025.fever-1.6
Volume: Proceedings of the Eighth Fact Extraction and VERification Workshop (FEVER)
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Mubashara Akhtar, Rami Aly, Christos Christodoulopoulos, Oana Cocarascu, Zhijiang Guo, Arpit Mittal, Michael Schlichtkrull, James Thorne, Andreas Vlachos
Venues: FEVER | WS
Publisher: Association for Computational Linguistics
Pages: 85–90
URL: https://preview.aclanthology.org/landing_page/2025.fever-1.6/
Cite (ACL): Andrew Bell. 2025. Less Can be More: An Empirical Evaluation of Small and Large Language Models for Sentence-level Claim Detection. In Proceedings of the Eighth Fact Extraction and VERification Workshop (FEVER), pages 85–90, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): Less Can be More: An Empirical Evaluation of Small and Large Language Models for Sentence-level Claim Detection (Bell, FEVER 2025)
PDF: https://preview.aclanthology.org/landing_page/2025.fever-1.6.pdf