Multi-Error Modeling and Fluency-Targeted Pre-training for Chinese Essay Evaluation

Jingshen Zhang, Xiangyu Yang, Xinkai Su, Xinglu Chen, Tianyou Huang, Xinying Qiu


Abstract
This system report presents our approaches and results for the Chinese Essay Fluency Evaluation (CEFE) task at CCL-2024. For Track 1, we optimized predictions for challenging fine-grained error types using binary classification models and trained coarse-grained models on the Chinese Learner 4W corpus. In Track 2, we enhanced performance by constructing a pseudo-dataset with multiple error types per sentence. For Track 3, where we achieved first place, we generated fluency-rated pseudo-data via back-translation for pretraining and used an NSP-based strategy with Symmetric Cross Entropy loss to capture context and mitigate long dependencies. Our methods effectively address key challenges in Chinese Essay Fluency Evaluation.
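The abstract names Symmetric Cross Entropy (SCE) loss without defining it. As a rough illustration only, the sketch below computes the commonly used form of SCE for a single example: a weighted sum of standard cross entropy and reverse cross entropy, with log 0 replaced by a negative constant. The function name, the weights alpha and beta, and the constant log_zero are assumptions chosen for illustration; the abstract does not state the values or the exact formulation the authors used.

```python
import math

def symmetric_cross_entropy(probs, target, alpha=1.0, beta=1.0, log_zero=-4.0):
    """Illustrative SCE for one example: alpha * CE + beta * RCE.

    probs    -- predicted class probabilities (assumed to sum to 1)
    target   -- index of the gold class
    alpha, beta, log_zero -- hypothetical hyperparameters, not taken from the paper
    """
    eps = 1e-7  # guards against log(0) in the forward term
    # Standard cross entropy against a one-hot label: -log p(target)
    ce = -math.log(max(probs[target], eps))
    # Reverse cross entropy: -sum_k p_k * log q_k, where q is the one-hot label.
    # log q_target = log 1 = 0, and log 0 is replaced by the constant log_zero.
    rce = -sum(p * log_zero for k, p in enumerate(probs) if k != target)
    return alpha * ce + beta * rce

# Example: a three-class fluency prediction where class 0 is the gold label.
loss = symmetric_cross_entropy([0.7, 0.2, 0.1], target=0)
```

The reverse term grows linearly with the probability mass placed on non-gold classes, which is why SCE is generally described as more tolerant of noisy labels than cross entropy alone; that property would be relevant for pseudo-data, though the abstract itself does not spell out the motivation.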
Anthology ID:
2024.ccl-3.30
Volume:
Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 3: Evaluations)
Month:
July
Year:
2024
Address:
Taiyuan, China
Editors:
Lin Hongfei, Tan Hongye, Li Bin
Venue:
CCL
Publisher:
Chinese Information Processing Society of China
Pages:
269–277
Language:
English
URL:
https://preview.aclanthology.org/author-degibert/2024.ccl-3.30/
Cite (ACL):
Jingshen Zhang, Xiangyu Yang, Xinkai Su, Xinglu Chen, Tianyou Huang, and Xinying Qiu. 2024. Multi-Error Modeling and Fluency-Targeted Pre-training for Chinese Essay Evaluation. In Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 3: Evaluations), pages 269–277, Taiyuan, China. Chinese Information Processing Society of China.
Cite (Informal):
Multi-Error Modeling and Fluency-Targeted Pre-training for Chinese Essay Evaluation (Zhang et al., CCL 2024)
PDF:
https://preview.aclanthology.org/author-degibert/2024.ccl-3.30.pdf