Overview of CCL25-Eval Task 4: Factivity Inference Evaluation 2025

Guanliang Cong; Junchao Wu; Chen Yang; Tianqi Xun; Derek F. Wong (黄辉); Bin Li; Yulin Yuan

Abstract

This paper presents the results of FIE2025, a shared task that evaluates the ability of Large Language Models (LLMs) to perform factivity inference on Chinese texts, that is, whether LLMs can correctly discern the veridical information of propositions encoded in complement clauses. Responses to the task reflect the extent to which LLMs can grasp the implicit truth judgments that human speakers make through texts, as well as their subjective stances. Such a capability is crucial for autonomous inference in intelligent agents and for fluid human–AI interaction. The task was hosted on the Alibaba Tianchi platform and evaluated in two tracks: with and without fine-tuning. A mixed dataset was constructed, combining synthetic sentences and authentic corpus instances. It comprises about 3,000 items labeled by expert linguists, including 845 (300+545) manually created items and 2,143 (700+1,443) items selected from existing corpora. In total, 74 teams successfully submitted 404 results to the Tianchi system. Overall, under current technological conditions, the key to successful factivity inference lies in whether LLMs effectively identify the different types of predicates and the various contextual conditions in the given texts. Models that support long-context prompt inputs tend to achieve the best inference performance when provided with numerous shots. This shared task deepened our understanding of the factivity phenomenon in Chinese, expanded the influence of factivity research within the field of natural language processing, and provided an exploratory precedent for future activities focusing on factivity inference in Chinese and potentially other languages.

Anthology ID: 2025.ccl-2.20
Volume: Proceedings of the 24th China National Conference on Computational Linguistics (CCL 2025)
Month: August
Year: 2025
Address: Jinan, China
Editors: Hongfei Lin, Bin Li, Hongye Tan
Venue: CCL
Publisher: Chinese Information Processing Society of China
Pages: 166–180
URL: https://preview.aclanthology.org/ingest-ccl/2025.ccl-2.20/
Cite (ACL): Guanliang Cong, Junchao Wu, Chen Yang, Tianqi Xun, Derek F. Wong, Bin Li, and Yulin Yuan. 2025. Overview of CCL25-Eval Task 4: Factivity Inference Evaluation 2025. In Proceedings of the 24th China National Conference on Computational Linguistics (CCL 2025), pages 166–180, Jinan, China. Chinese Information Processing Society of China.
Cite (Informal): Overview of CCL25-Eval Task 4: Factivity Inference Evaluation 2025 (Cong et al., CCL 2025)
PDF: https://preview.aclanthology.org/ingest-ccl/2025.ccl-2.20.pdf
