A Probabilistic Annotation Model for Crowdsourcing Coreference
Silviu Paun, Jon Chamberlain, Udo Kruschwitz, Juntao Yu, Massimo Poesio
Abstract
The availability of large-scale annotated corpora for coreference is essential to the development of the field. However, creating resources at the required scale via expert annotation would be too expensive. Crowdsourcing has been proposed as an alternative, but this approach has not been widely used for coreference. This paper addresses one crucial hurdle on the way to making this possible by introducing a new annotation model for aggregating crowdsourced anaphoric annotations. The model is evaluated along three dimensions: the accuracy of the inferred mention pairs, the quality of the post-hoc constructed silver chains, and the viability of using the silver chains as an alternative to the expert-annotated chains for training a state-of-the-art coreference system. The results suggest that our model can extract, from crowdsourced annotations, coreference chains of quality comparable to those obtained with expert annotation.
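The paper's own probabilistic aggregation model is described in the PDF linked below. As a rough illustration of the general idea only, inferring which mention pairs are coreferent from noisy crowd judgments weighted by per-annotator reliability, here is a minimal Dawid–Skene-style EM sketch. The data layout, variable names, and smoothing constants are assumptions made for this example, not the authors' implementation.

```python
# Minimal Dawid-Skene-style EM sketch for aggregating crowd judgments on
# mention pairs (coreferent vs. not). Illustration of the general idea only;
# NOT the model introduced in the paper. Data layout and smoothing constants
# are assumptions made for this example.
from collections import defaultdict

# judgments[(pair_id, worker_id)] = 1 if the worker marked the pair coreferent, else 0
judgments = {
    ("p1", "w1"): 1, ("p1", "w2"): 1, ("p1", "w3"): 0,
    ("p2", "w1"): 0, ("p2", "w2"): 0, ("p2", "w3"): 1,
}

pairs = sorted({p for p, _ in judgments})
workers = sorted({w for _, w in judgments})

# Initialise the posterior that each pair is truly coreferent by majority vote.
post = {p: sum(v for (pp, _), v in judgments.items() if pp == p) /
           sum(1 for (pp, _) in judgments if pp == p) for p in pairs}

for _ in range(50):  # EM iterations
    # M-step: per-worker sensitivity/specificity and the class prior,
    # estimated from the current soft labels (with small pseudo-counts).
    sens, spec = {}, {}
    for w in workers:
        tp = fn = tn = fp = 0.5
        for (p, ww), v in judgments.items():
            if ww != w:
                continue
            tp += post[p] * v
            fn += post[p] * (1 - v)
            tn += (1 - post[p]) * (1 - v)
            fp += (1 - post[p]) * v
        sens[w] = tp / (tp + fn)
        spec[w] = tn / (tn + fp)
    prior = sum(post.values()) / len(pairs)

    # E-step: recompute the posterior that each pair is coreferent.
    for p in pairs:
        like1, like0 = prior, 1 - prior
        for (pp, w), v in judgments.items():
            if pp != p:
                continue
            like1 *= sens[w] if v == 1 else (1 - sens[w])
            like0 *= (1 - spec[w]) if v == 1 else spec[w]
        post[p] = like1 / (like1 + like0)

print({p: round(post[p], 3) for p in pairs})  # aggregated coreference probabilities
```

Running this prints an aggregated coreference probability per mention pair; the silver chains mentioned in the abstract would then be constructed post hoc from the pairs judged coreferent.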
- Anthology ID: D18-1218
- Volume: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
- Month: October-November
- Year: 2018
- Address: Brussels, Belgium
- Venue: EMNLP
- SIG: SIGDAT
- Publisher: Association for Computational Linguistics
- Pages: 1926–1937
- URL: https://aclanthology.org/D18-1218
- DOI: 10.18653/v1/D18-1218
- Cite (ACL): Silviu Paun, Jon Chamberlain, Udo Kruschwitz, Juntao Yu, and Massimo Poesio. 2018. A Probabilistic Annotation Model for Crowdsourcing Coreference. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 1926–1937, Brussels, Belgium. Association for Computational Linguistics.
- Cite (Informal): A Probabilistic Annotation Model for Crowdsourcing Coreference (Paun et al., EMNLP 2018)
- PDF: https://preview.aclanthology.org/starsem-semeval-split/D18-1218.pdf