Zhixing Li
2026
CL2GEC: A Multi-Discipline Benchmark for Continual Learning in Chinese Literature Grammatical Error Correction
Shang Qin | Jingheng Ye | Yinghui Li | Hai-Tao Zheng | Qi Li | Jinxiao Shan | Zhixing Li | Hong-Gee Kim
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
The growing demand for automated writing assistance in diverse academic domains highlights the need for robust Chinese Grammatical Error Correction (CGEC) systems that can adapt across disciplines. However, existing CGEC research largely lacks dedicated benchmarks for multi-disciplinary academic writing, overlooking continual learning (CL) as a promising solution to handle domain-specific linguistic variation and prevent catastrophic forgetting. To fill this crucial gap, we introduce CL2GEC, the first Continual Learning benchmark for Chinese Literature Grammatical Error Correction, designed to evaluate adaptive CGEC across multiple academic fields. Our benchmark includes 10,000 human-annotated sentences spanning 10 disciplines, each exhibiting distinct linguistic styles and error patterns. CL2GEC focuses on evaluating grammatical error correction in a continual learning setting, simulating sequential exposure to diverse academic disciplines to reflect real-world editorial dynamics. We evaluate large language models under sequential tuning, parameter-efficient adaptation, and four representative CL algorithms, using both standard GEC metrics and continual learning metrics adapted to task-level variation. Experimental results reveal that regularization-based methods mitigate forgetting more effectively than replay-based or naive sequential approaches. Our benchmark provides a rigorous foundation for future research in adaptive grammatical error correction across diverse academic domains.
2014
Language Processing Infrastructure in the XLike Project
Lluís Padró | Željko Agić | Xavier Carreras | Blaz Fortuna | Esteban García-Cuesta | Zhixing Li | Tadej Štajner | Marko Tadić
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
This paper presents the linguistic analysis tools and their infrastructure developed within the XLike project. The main goal of the implemented tools is to provide a set of functionalities supporting some of the main objectives of XLike, such as enabling cross-lingual services for publishers, media monitoring, or developing new business intelligence applications. The services cover seven major and minor languages: English, German, Spanish, Chinese, Catalan, Slovenian, and Croatian. These analyzers are provided as web services following a lightweight SOA approach; they are publicly callable and catalogued in META-SHARE.
XLike Project Language Analysis Services
Xavier Carreras | Lluís Padró | Lei Zhang | Achim Rettinger | Zhixing Li | Esteban García-Cuesta | Željko Agić | Božo Bekavac | Blaz Fortuna | Tadej Štajner
Proceedings of the Demonstrations at the 14th Conference of the European Chapter of the Association for Computational Linguistics