System Report for CCL25-Eval Task 4: Prompting, Scheduling, and Arbitration Strategies for Chinese Factivity Inference

Liu Daohuan, Xia Lun, Yuxuan Zhang, Xinyu Yang, Fanzhen Kong


Abstract
This report presents our methodology and findings on prompting large language models (LLMs) for Chinese Factivity Inference (FI). We evaluated five LLMs, among which DeepSeek-R1 demonstrated the best overall performance. Chain-of-Thought (CoT) prompting, few-shot examples, and system-level instructions were combined in the final prompt. Additionally, we introduced a pairwise task scheduling strategy and a multi-agent disagreement arbitration mechanism to further enhance inference quality. Experimental results show that the integration of prompting, scheduling, and arbitration strategies significantly improves performance, with DeepSeek-R1 achieving 91.7% overall accuracy on the evaluation set. The report also highlights findings regarding LLM behavior on FI tasks and outlines potential directions for future improvement.
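The disagreement arbitration mentioned above can be illustrated with a minimal sketch. This is not the paper's actual implementation: the function names, label strings, and the majority-vote-with-tie-breaker rule are illustrative assumptions about how labels from multiple agents might be reconciled when they disagree.

```python
from collections import Counter

def arbitrate(labels, tie_breaker=0):
    """Hypothetical arbitration over per-agent factivity labels.

    Accepts the majority label when one agent opinion clearly
    dominates; otherwise defers to a designated tie-breaking agent.
    """
    counts = Counter(labels)
    top, freq = counts.most_common(1)[0]
    # Clear majority (including unanimity): accept directly.
    if freq > len(labels) / 2:
        return top
    # No majority: defer to the agent at index `tie_breaker`.
    return labels[tie_breaker]

print(arbitrate(["entailed", "entailed", "neutral"]))  # -> entailed
print(arbitrate(["entailed", "neutral"]))              # -> entailed (tie, agent 0)
```

In practice, an arbitration step like this could also re-prompt a stronger model with the conflicting rationales rather than vote; the vote is used here only to keep the sketch self-contained.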
Anthology ID:
2025.ccl-2.17
Volume:
Proceedings of the 24th China National Conference on Computational Linguistics (CCL 2025)
Month:
August
Year:
2025
Address:
Jinan, China
Editors:
Hongfei Lin, Bin Li, Hongye Tan
Venue:
CCL
Publisher:
Chinese Information Processing Society of China
Pages:
146–151
URL:
https://preview.aclanthology.org/ingest-ccl/2025.ccl-2.17/
Cite (ACL):
Liu Daohuan, Xia Lun, Yuxuan Zhang, Xinyu Yang, and Fanzhen Kong. 2025. System Report for CCL25-Eval Task 4: Prompting, Scheduling, and Arbitration Strategies for Chinese Factivity Inference. In Proceedings of the 24th China National Conference on Computational Linguistics (CCL 2025), pages 146–151, Jinan, China. Chinese Information Processing Society of China.
Cite (Informal):
System Report for CCL25-Eval Task 4: Prompting, Scheduling, and Arbitration Strategies for Chinese Factivity Inference (Daohuan et al., CCL 2025)
PDF:
https://preview.aclanthology.org/ingest-ccl/2025.ccl-2.17.pdf