Zhaxi Zerong
2025
A Systematic Survey of Claim Verification: Corpora, Systems, and Case Studies
Zhaxi Zerong | Chenxi Li | Xinyi Liu | Ju-hui Chen | Fei Xia
Findings of the Association for Computational Linguistics: EMNLP 2025
Automated Claim Verification (CV)—the task of assessing a claim’s veracity against explicitly provided evidence—is a critical tool in the fight against growing misinformation. This survey offers a comprehensive analysis of 198 studies published between January 2022 and March 2025, synthesizing recent advances in CV corpus creation and system design. Through two in-depth case studies, we illuminate persistent challenges in veracity annotation, limitations of conventional CV pipelines, and pitfalls in recent claim decomposition approaches. We conclude by identifying key unresolved challenges and proposing productive directions for future research.
2024
Challenging Large Language Models with New Tasks: A Study on their Adaptability and Robustness
Chenxi Li | Yuanhe Tian | Zhaxi Zerong | Yan Song | Fei Xia
Findings of the Association for Computational Linguistics: ACL 2024
Recent progress in large language models (LLMs) has marked a notable milestone in the field of artificial intelligence. The conventional evaluation of LLMs primarily relies on existing tasks and benchmarks, raising concerns about test set contamination and the genuine comprehension abilities of LLMs. To address these concerns, we propose to evaluate LLMs by designing new tasks, automatically generating evaluation datasets for the tasks, and conducting detailed error analyses to scrutinize LLMs’ adaptability to new tasks, their sensitivity to prompt variations, and their error tendencies. We investigate the capacity of LLMs to adapt to new but simple tasks, especially when they diverge from the models’ pre-existing knowledge. Our methodology emphasizes the creation of straightforward tasks, facilitating a precise error analysis to uncover the underlying causes of LLM failures. This strategic approach also aims to uncover effective strategies for enhancing LLM performance based on the detailed error analysis of system output.