Abstract
We present a systematic procedure for interrater disagreement resolution. The procedure is general, but of particular use in multiple-annotator tasks geared towards ground truth construction. We motivate our proposal by arguing that, barring cases in which the researchers’ goal is to elicit different viewpoints, interrater disagreement is a sign of poor quality in the design or the description of a task. Consensus among annotators, we maintain, should be striven for, through a systematic procedure for disagreement resolution such as the one we describe.
- Anthology ID: 2021.humeval-1.15
- Volume: Proceedings of the Workshop on Human Evaluation of NLP Systems (HumEval)
- Month: April
- Year: 2021
- Address: Online
- Venue: HumEval
- Publisher: Association for Computational Linguistics
- Pages: 131–141
- URL: https://aclanthology.org/2021.humeval-1.15
- Cite (ACL): Yvette Oortwijn, Thijs Ossenkoppele, and Arianna Betti. 2021. Interrater Disagreement Resolution: A Systematic Procedure to Reach Consensus in Annotation Tasks. In Proceedings of the Workshop on Human Evaluation of NLP Systems (HumEval), pages 131–141, Online. Association for Computational Linguistics.
- Cite (Informal): Interrater Disagreement Resolution: A Systematic Procedure to Reach Consensus in Annotation Tasks (Oortwijn et al., HumEval 2021)
- PDF: https://preview.aclanthology.org/remove-xml-comments/2021.humeval-1.15.pdf
- Code: yoortwijn/humevaldisres