Towards Automated Document Revision: Grammatical Error Correction, Fluency Edits, and Beyond

Masato Mita, Keisuke Sakaguchi, Masato Hagiwara, Tomoya Mizumoto, Jun Suzuki, Kentaro Inui


Abstract
Natural language processing (NLP) technology has rapidly improved automated grammatical error correction (GEC) tasks, and the GEC community has begun to explore document-level revision. However, there are two major obstacles to going beyond automated sentence-level GEC to NLP-based document-level revision support: (1) there are few public corpora with document-level revisions annotated by professional editors, and (2) it is infeasible to obtain all possible references and evaluate revision quality using such references because there are infinite revision possibilities. To address these challenges, this paper proposes a new document revision corpus, Text Revision of ACL papers (TETRA), in which multiple professional editors have revised academic papers sampled from the ACL anthology. This corpus enables us to focus on document-level and paragraph-level edits, such as edits related to coherence and consistency. Additionally, as a case study using the TETRA corpus, we investigate reference-less and interpretable methods for meta-evaluation to detect quality improvements according to document revisions. We show the uniqueness of TETRA compared with existing document revision corpora and demonstrate that a fine-tuned pre-trained language model can discriminate the quality of documents after revision even when the difference is subtle.
Anthology ID:
2024.bea-1.21
Volume:
Proceedings of the 19th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2024)
Month:
June
Year:
2024
Address:
Mexico City, Mexico
Editors:
Ekaterina Kochmar, Marie Bexte, Jill Burstein, Andrea Horbach, Ronja Laarmann-Quante, Anaïs Tack, Victoria Yaneva, Zheng Yuan
Venue:
BEA
SIG:
SIGEDU
Publisher:
Association for Computational Linguistics
Note:
Pages:
251–265
Language:
URL:
https://aclanthology.org/2024.bea-1.21
DOI:
Bibkey:
Cite (ACL):
Masato Mita, Keisuke Sakaguchi, Masato Hagiwara, Tomoya Mizumoto, Jun Suzuki, and Kentaro Inui. 2024. Towards Automated Document Revision: Grammatical Error Correction, Fluency Edits, and Beyond. In Proceedings of the 19th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2024), pages 251–265, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal):
Towards Automated Document Revision: Grammatical Error Correction, Fluency Edits, and Beyond (Mita et al., BEA 2024)
Copy Citation:
PDF:
https://preview.aclanthology.org/jeptaln-2024-ingestion/2024.bea-1.21.pdf