Abstract
Humans do not make inferences over texts, but over models of what texts are about. When annotators are asked to annotate coreferent spans of text, it is therefore a somewhat unnatural task. This paper presents an alternative in which we preprocess documents, linking entities to a knowledge base, and turn the coreference annotation task – in our case limited to pronouns – into an annotation task where annotators are asked to assign pronouns to entities. Model-based annotation is shown to lead to faster annotation and higher inter-annotator agreement, and we argue that it also opens up an alternative approach to coreference resolution. We present two new coreference benchmark datasets, for English Wikipedia and English teacher-student dialogues, and evaluate state-of-the-art coreference resolvers on them.
- Anthology ID:
- 2020.lrec-1.9
- Volume:
- Proceedings of the Twelfth Language Resources and Evaluation Conference
- Month:
- May
- Year:
- 2020
- Address:
- Marseille, France
- Venue:
- LREC
- Publisher:
- European Language Resources Association
- Pages:
- 74–79
- Language:
- English
- URL:
- https://aclanthology.org/2020.lrec-1.9
- Cite (ACL):
- Rahul Aralikatte and Anders Søgaard. 2020. Model-based Annotation of Coreference. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 74–79, Marseille, France. European Language Resources Association.
- Cite (Informal):
- Model-based Annotation of Coreference (Aralikatte & Søgaard, LREC 2020)
- PDF:
- https://preview.aclanthology.org/remove-xml-comments/2020.lrec-1.9.pdf
- Code:
- rahular/model-based-coref
- Data:
- QuAC, WikiCoref