Ruitong Liu
2025
CLEME2.0: Towards Interpretable Evaluation by Disentangling Edits for Grammatical Error Correction
Jingheng Ye | Zishan Xu | Yinghui Li | Linlin Song | Qingyu Zhou | Hai-Tao Zheng | Ying Shen | Wenhao Jiang | Hong-Gee Kim | Ruitong Liu | Xin Su | Zifei Shan
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
This paper focuses on the interpretability of Grammatical Error Correction (GEC) evaluation metrics, which has received little attention in previous studies. To bridge this gap, we introduce **CLEME2.0**, a reference-based metric that describes four fundamental aspects of GEC systems: hit-correction, wrong-correction, under-correction, and over-correction. Together, these aspects expose critical qualities of GEC systems and help locate their drawbacks. Evaluating systems by combining these aspects also yields higher consistency with human judgments than other reference-based and reference-less metrics. Extensive experiments on two human judgment datasets and six reference datasets demonstrate the effectiveness and robustness of our method, which achieves a new state-of-the-art result. Our code is released at https://github.com/THUKElab/CLEME.
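To make the four aspects named in the abstract concrete, here is a minimal Python sketch that sorts a system's edits into hit-, wrong-, under-, and over-corrections against a single reference. It is an illustration only, not the CLEME2.0 algorithm: the `Edit` type, the `classify_edits` function, the exact-span matching rule, and the example sentence are all assumptions introduced here. The authors' actual implementation is in the repository linked above.

```python
from typing import NamedTuple


class Edit(NamedTuple):
    """Hypothetical edit record: replace source tokens [start, end) with text."""
    start: int  # index of the first source token covered by the edit
    end: int    # index one past the last covered source token
    text: str   # replacement text ("" would mean deletion)


def classify_edits(hyp_edits, ref_edits):
    """Count the four aspects by comparing edits on identical source spans.

    Assumption: two edits are comparable only if their spans match exactly.
    """
    ref_by_span = {(e.start, e.end): e.text for e in ref_edits}
    hyp_by_span = {(e.start, e.end): e.text for e in hyp_edits}
    counts = {"hit": 0, "wrong": 0, "under": 0, "over": 0}

    for span, text in hyp_by_span.items():
        if span not in ref_by_span:
            counts["over"] += 1    # system edited where the reference did not
        elif text == ref_by_span[span]:
            counts["hit"] += 1     # same span, same replacement
        else:
            counts["wrong"] += 1   # same span, different replacement

    for span in ref_by_span:
        if span not in hyp_by_span:
            counts["under"] += 1   # reference edit the system missed

    return counts


# Made-up source: "She go to school yesterday and buyed a books ."
# Token indices:    0   1  2  3      4         5   6     7 8     9
ref = [Edit(1, 2, "went"), Edit(6, 7, "bought"), Edit(8, 9, "book")]
hyp = [Edit(1, 2, "went"), Edit(6, 7, "buy"), Edit(3, 4, "the school")]

result = classify_edits(hyp, ref)
print(result)
```

In this toy case, the system fixes "go" correctly (hit), changes "buyed" to the wrong form (wrong), leaves "books" untouched (under), and rewrites "school" unnecessarily (over). Reporting the four counts separately, rather than a single score, is what lets a reader see where a system falls short.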