System Report for CCL25-Eval Task 5: New Dataset and LoRA-Fine-Tuned Qwen2.5

Haotao Xie


Abstract
Recently, large language models (LLMs) have achieved promising progress in the fields of classical Chinese translation and the generation of classical poetry. However, domain-specific research on precise translation and affective-semantic understanding of classical poetry remains limited. The main challenge is that most studies treat the poetic appreciation task as a general-domain problem, neglecting the distinctive features of poetic appreciation, while high-quality, domain-specific datasets are extremely limited. To address this limitation, we decompose the task into three subtasks: term interpretation, semantic interpretation, and emotional inference. Based on multiple open-source datasets, we perform data cleansing and alignment to construct the Classical Chinese Poetry Instruction Pair Dataset (CCPoetry-49K), which comprises 49,404 high-quality instruction–response pairs explicitly optimized for this domain. We then propose a domain-specialized LLM, called PoetryQwen, by applying Low-Rank Adaptation (LoRA) to fine-tune the Qwen2.5-14B model. Experimental results on the CCL25-Eval Task 5 benchmark demonstrate that PoetryQwen achieves a score of 0.757, a 9.7% improvement over the Qwen2.5-14B-Instruct baseline (0.690). These findings indicate that PoetryQwen significantly enhances performance in precise translation and emotional understanding of classical poetry. We present a new dataset and methodological considerations intended to support the domain-specific optimization of LLMs.
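The LoRA technique the abstract refers to adapts a frozen pretrained weight matrix W by learning a low-rank update B·A, so only the small factors are trained. The following is a minimal NumPy sketch of that idea (not the authors' code; the dimensions and scaling factor are illustrative assumptions, far smaller than the actual Qwen2.5-14B layers):

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, r = 64, 64, 8  # hypothetical layer sizes and LoRA rank

W = rng.standard_normal((d_out, d_in))      # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01   # trainable low-rank factor
B = np.zeros((d_out, r))                    # zero-initialized, so the update starts at 0

def lora_forward(x, alpha=16):
    # Effective weight is W + (alpha/r) * B @ A; W itself is never updated.
    return x @ (W + (alpha / r) * (B @ A)).T

x = rng.standard_normal((1, d_in))
# At initialization B = 0, so the adapted layer matches the frozen base layer.
assert np.allclose(lora_forward(x), x @ W.T)
```

In practice only A and B receive gradients, which is why LoRA fine-tuning of a 14B-parameter model is tractable on modest hardware.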
Anthology ID:
2025.ccl-2.24
Volume:
Proceedings of the 24th China National Conference on Computational Linguistics (CCL 2025)
Month:
August
Year:
2025
Address:
Jinan, China
Editors:
Hongfei Lin, Bin Li, Hongye Tan
Venue:
CCL
Publisher:
Chinese Information Processing Society of China
Pages:
200–205
URL:
https://preview.aclanthology.org/ingest-ccl/2025.ccl-2.24/
Cite (ACL):
Haotao Xie. 2025. System Report for CCL25-Eval Task 5: New Dataset and LoRA-Fine-Tuned Qwen2.5. In Proceedings of the 24th China National Conference on Computational Linguistics (CCL 2025), pages 200–205, Jinan, China. Chinese Information Processing Society of China.
Cite (Informal):
System Report for CCL25-Eval Task 5: New Dataset and LoRA-Fine-Tuned Qwen2.5 (Xie, CCL 2025)
PDF:
https://preview.aclanthology.org/ingest-ccl/2025.ccl-2.24.pdf