Abdullah Aljebreen


2025

UniT: One Document, Many Revisions, Too Many Edit Intention Taxonomies
Fangping Lan | Abdullah Aljebreen | Eduard Dragut
Findings of the Association for Computational Linguistics: ACL 2025

Writing is inherently iterative, each revision enhancing information representation. One revision may contain many edits. Examining the intentions behind edits provides valuable insights into an editor’s expertise, the dynamics of collaborative writing, and the evolution of a document. Current research on edit intentions lacks a comprehensive edit intention taxonomy (EIT) that spans multiple application domains. As a result, researchers often create new EITs tailored to specific needs, a process that is both time-consuming and costly. To address this gap, we propose UniT, a Unified edit intention Taxonomy that integrates existing EITs encompassing a wide range of edit intentions. We examine the lineage relationships and construction of 24 EITs, which together comprise 232 categories across various domains. During the literature survey and integration process, we identify challenges such as one-to-many category matches, incomplete definitions, and varying hierarchical structures, and we propose solutions for resolving them. Finally, our evaluation shows that UniT achieves higher inter-annotator agreement than existing EITs and is applicable to a large set of application domains.