Abstract
Metric validation in Grammatical Error Correction (GEC) is currently done by observing the correlation between human and metric-induced rankings. However, such correlation studies are costly, methodologically troublesome, and suffer from low inter-rater agreement. We propose MAEGE, an automatic methodology for GEC metric validation that overcomes many of the difficulties in the existing methodology. Experiments with MAEGE shed new light on metric quality, showing for example that the standard M² metric fares poorly on corpus-level ranking. Moreover, we use MAEGE to perform a detailed analysis of metric behavior, showing that some types of valid edits are consistently penalized by existing metrics.
- Anthology ID:
- P18-1127
- Volume:
- Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
- Month:
- July
- Year:
- 2018
- Address:
- Melbourne, Australia
- Venue:
- ACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 1372–1382
- URL:
- https://aclanthology.org/P18-1127
- DOI:
- 10.18653/v1/P18-1127
- Cite (ACL):
- Leshem Choshen and Omri Abend. 2018. Automatic Metric Validation for Grammatical Error Correction. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1372–1382, Melbourne, Australia. Association for Computational Linguistics.
- Cite (Informal):
- Automatic Metric Validation for Grammatical Error Correction (Choshen & Abend, ACL 2018)
- PDF:
- https://preview.aclanthology.org/paclic-22-ingestion/P18-1127.pdf
- Code
- borgr/EoE