Improving Human-Labeled Data through Dynamic Automatic Conflict Resolution
David Q. Sun, Hadas Kotek, Christopher Klein, Mayank Gupta, William Li, Jason D. Williams
Abstract
This paper develops and implements a scalable methodology for (a) estimating the noisiness of labels produced by a typical crowdsourcing semantic annotation task, and (b) reducing the resulting error of the labeling process by as much as 20-30% in comparison to other common labeling strategies. Importantly, this new approach to the labeling process, which we name Dynamic Automatic Conflict Resolution (DACR), does not require a ground truth dataset and is instead based on inter-project annotation inconsistencies. This makes DACR not only more accurate but also available to a broad range of labeling tasks. In what follows we present results from a text classification task performed at scale for a commercial personal assistant, and evaluate the inherent ambiguity uncovered by this annotation strategy as compared to other common labeling strategies.

- Anthology ID: 2020.coling-main.316
- Volume: Proceedings of the 28th International Conference on Computational Linguistics
- Month: December
- Year: 2020
- Address: Barcelona, Spain (Online)
- Editors: Donia Scott, Nuria Bel, Chengqing Zong
- Venue: COLING
- Publisher: International Committee on Computational Linguistics
- Pages: 3547–3557
- URL: https://preview.aclanthology.org/ingest_wac_2008/2020.coling-main.316/
- DOI: 10.18653/v1/2020.coling-main.316
- Cite (ACL): David Q. Sun, Hadas Kotek, Christopher Klein, Mayank Gupta, William Li, and Jason D. Williams. 2020. Improving Human-Labeled Data through Dynamic Automatic Conflict Resolution. In Proceedings of the 28th International Conference on Computational Linguistics, pages 3547–3557, Barcelona, Spain (Online). International Committee on Computational Linguistics.
- Cite (Informal): Improving Human-Labeled Data through Dynamic Automatic Conflict Resolution (Sun et al., COLING 2020)
- PDF: https://preview.aclanthology.org/ingest_wac_2008/2020.coling-main.316.pdf