@inproceedings{guo-etal-2025-candy,
title = "{CANDY}: Benchmarking {LLM}s' Limitations and Assistive Potential in {C}hinese Misinformation Fact-Checking",
author = "Guo, Ruiling and
Yang, Xinwei and
Huang, Chen and
Zhang, Tong and
Hu, Yong",
editor = "Christodoulopoulos, Christos and
Chakraborty, Tanmoy and
Rose, Carolyn and
Peng, Violet",
booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2025",
month = nov,
year = "2025",
address = "Suzhou, China",
publisher = "Association for Computational Linguistics",
url = "https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.307/",
doi = "10.18653/v1/2025.findings-emnlp.307",
pages = "5724--5758",
ISBN = "979-8-89176-335-7",
abstract = "The effectiveness of large language models (LLMs) to fact-check misinformation remains uncertain, despite their growing use. To this end, we present CANDY, a benchmark designed to systematically evaluate the capabilities and limitations of LLMs in fact-checking Chinese misinformation. Specifically, we curate a carefully annotated dataset of {\textasciitilde}20k instances. Our analysis shows that current LLMs exhibit limitations in generating accurate fact-checking conclusions, even when enhanced with chain-of-thought reasoning and few-shot prompting. To understand these limitations, we develop a taxonomy to categorize flawed LLM-generated explanations for their conclusions and identify factual fabrication as the most common failure mode. Although LLMs alone are unreliable for fact-checking, our findings indicate their considerable potential to augment human performance when deployed as assistive tools in scenarios. Our dataset and code can be accessed at \url{https://github.com/SCUNLP/CANDY}."
}
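The abstract notes that the LLMs were evaluated with chain-of-thought reasoning and few-shot prompting. Below is a minimal, hypothetical sketch of how such a fact-checking prompt might be assembled; it is not the authors' released code (their official dataset and evaluation scripts are at the GitHub URL above), and the example claim, verdict labels, and function names are illustrative assumptions.

```python
# Hypothetical sketch of few-shot chain-of-thought prompting for
# misinformation fact-checking, as described in the CANDY abstract.
# Not the authors' code; labels and examples below are placeholders.

FEW_SHOT_EXAMPLES = [
    {
        # Hypothetical in-context demonstration (claim, reasoning, verdict).
        "claim": "Drinking hot water cures influenza.",
        "reasoning": "Influenza is a viral infection; no clinical evidence "
                     "supports hot water as a cure.",
        "verdict": "False",
    },
]

def build_prompt(claim: str) -> str:
    """Assemble a few-shot chain-of-thought fact-checking prompt for one claim."""
    parts = [
        "You are a fact-checker. For each claim, reason step by step, "
        "then give a verdict of True or False."
    ]
    # Prepend the in-context demonstrations, each with explicit reasoning.
    for ex in FEW_SHOT_EXAMPLES:
        parts.append(f"Claim: {ex['claim']}\n"
                     f"Reasoning: {ex['reasoning']}\n"
                     f"Verdict: {ex['verdict']}")
    # The target claim ends with "Reasoning:" so the model reasons before judging.
    parts.append(f"Claim: {claim}\nReasoning:")
    return "\n\n".join(parts)

if __name__ == "__main__":
    print(build_prompt("Example claim to verify."))
```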