Xinglu Chen
2024
Readability-guided Idiom-aware Sentence Simplification (RISS) for Chinese
Jingshen Zhang | Xinglu Chen | Xinying Qiu | Zhimin Wang | Wenhe Feng
Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 1: Main Conference)
“Chinese sentence simplification faces challenges due to the lack of large-scale labeled parallel corpora and the prevalence of idioms. To address these challenges, we propose Readability-guided Idiom-aware Sentence Simplification (RISS), a novel framework that combines data augmentation techniques. RISS introduces two key components: (1) Readability-guided Paraphrase Selection (RPS), a method for mining high-quality sentence pairs, and (2) Idiom-aware Simplification (IAS), a model that enhances the comprehension and simplification of idiomatic expressions. By integrating RPS and IAS using multi-stage and multi-task learning strategies, RISS outperforms previous state-of-the-art methods on two Chinese sentence simplification datasets. Furthermore, RISS achieves additional improvements when fine-tuned on a small labeled dataset. Our approach demonstrates the potential for more effective and accessible Chinese text simplification.”
Multi-Error Modeling and Fluency-Targeted Pre-training for Chinese Essay Evaluation
Jingshen Zhang | Xiangyu Yang | Xinkai Su | Xinglu Chen | Tianyou Huang | Xinying Qiu
Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 3: Evaluations)
“This system report presents our approaches and results for the Chinese Essay Fluency Evaluation (CEFE) task at CCL-2024. For Track 1, we optimized predictions for challenging fine-grained error types using binary classification models and trained coarse-grained models on the Chinese Learner 4W corpus. In Track 2, we enhanced performance by constructing a pseudo-dataset with multiple error types per sentence. For Track 3, where we achieved first place, we generated fluency-rated pseudo-data via back-translation for pretraining and used an NSP-based strategy with Symmetric Cross Entropy loss to capture context and mitigate long dependencies. Our methods effectively address key challenges in Chinese Essay Fluency Evaluation.”
Co-authors
- Xin Ying Qiu 2
- Jingshen Zhang 2
- Wenhe Feng (冯文贺) 1
- Tianyou Huang 1
- Xinkai Su 1
Venues
- ccl 2