@inproceedings{zhang-etal-2024-need,
title = "Do We Need Language-Specific Fact-Checking Models? The Case of {C}hinese",
author = "Zhang, Caiqi and
Guo, Zhijiang and
Vlachos, Andreas",
editor = "Al-Onaizan, Yaser and
Bansal, Mohit and
Chen, Yun-Nung",
booktitle = "Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing",
month = nov,
year = "2024",
address = "Miami, Florida, USA",
publisher = "Association for Computational Linguistics",
url = "https://preview.aclanthology.org/jlcl-multiple-ingestion/2024.emnlp-main.113/",
doi = "10.18653/v1/2024.emnlp-main.113",
pages = "1899--1914",
abstract = "This paper investigates the potential benefits of language-specific fact-checking models, focusing on the case of Chinese using CHEF dataset. To better reflect real-world fact-checking, we first develop a novel Chinese document-level evidence retriever, achieving state-of-the-art performance. We then demonstrate the limitations of translation-based methods and multilingual language models, highlighting the need for language-specific systems. To better analyze token-level biases in different systems, we construct an adversarial dataset based on the CHEF dataset, where each instance has a large word overlap with the original one but holds the opposite veracity label. Experimental results on the CHEF dataset and our adversarial dataset show that our proposed method outperforms translation-based methods and multilingual language models and is more robust toward biases, emphasizing the importance of language-specific fact-checking systems."
}
[Do We Need Language-Specific Fact-Checking Models? The Case of Chinese](https://aclanthology.org/2024.emnlp-main.113/) (Zhang et al., EMNLP 2024)