CROWD-IN-THE-LOOP: A Hybrid Approach for Annotating Semantic Roles

Chenguang Wang, Alan Akbik, Laura Chiticariu, Yunyao Li, Fei Xia, Anbang Xu


Abstract
Crowdsourcing has proven to be an effective method for generating labeled data for a range of NLP tasks. However, multiple recent attempts at using crowdsourcing to generate gold-labeled training data for semantic role labeling (SRL) have reported only modest results, suggesting that SRL may be too difficult a task to crowdsource effectively. In this paper, we postulate that while producing SRL annotation does in general require expert involvement, a large subset of SRL labeling tasks is in fact well suited to the crowd. We present a novel workflow in which we employ a classifier to identify difficult annotation tasks and route each task to either experts or crowd workers according to its difficulty. Our experimental evaluation shows that the proposed approach reduces the workload for experts by over two-thirds, and thus significantly reduces the cost of producing SRL annotation with little loss in quality.
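
The routing workflow described in the abstract can be sketched in a few lines of Python. This is a hypothetical illustration only, not the paper's implementation: the task representation, the difficulty function, and the threshold below are all assumptions, whereas the paper trains an actual classifier to predict which tasks are too hard for crowd workers.

    # Sketch of difficulty-based task routing (hypothetical names/threshold;
    # the paper uses a trained classifier, not a length heuristic).
    from dataclasses import dataclass
    from typing import Callable, List, Tuple

    @dataclass
    class SRLTask:
        sentence: str
        predicate: str  # the predicate whose arguments need role labels

    def route_tasks(
        tasks: List[SRLTask],
        difficulty: Callable[[SRLTask], float],  # higher score = harder
        threshold: float = 0.5,
    ) -> Tuple[List[SRLTask], List[SRLTask]]:
        """Split tasks into (crowd_queue, expert_queue) by predicted difficulty."""
        crowd: List[SRLTask] = []
        expert: List[SRLTask] = []
        for task in tasks:
            (expert if difficulty(task) >= threshold else crowd).append(task)
        return crowd, expert

    # Toy difficulty proxy: longer sentences are assumed harder to annotate.
    toy_difficulty = lambda t: min(1.0, len(t.sentence.split()) / 40)

    crowd_q, expert_q = route_tasks(
        [SRLTask("The cat sat on the mat .", "sat")], toy_difficulty
    )

Under this sketch, only the tasks in expert_q incur expert annotation cost; the abstract's reported result is that such routing cuts the expert workload by over two-thirds with little loss in annotation quality.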
Anthology ID:
D17-1205
Volume:
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing
Month:
September
Year:
2017
Address:
Copenhagen, Denmark
Editors:
Martha Palmer, Rebecca Hwa, Sebastian Riedel
Venue:
EMNLP
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
1913–1922
URL:
https://aclanthology.org/D17-1205
DOI:
10.18653/v1/D17-1205
Cite (ACL):
Chenguang Wang, Alan Akbik, Laura Chiticariu, Yunyao Li, Fei Xia, and Anbang Xu. 2017. CROWD-IN-THE-LOOP: A Hybrid Approach for Annotating Semantic Roles. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 1913–1922, Copenhagen, Denmark. Association for Computational Linguistics.
Cite (Informal):
CROWD-IN-THE-LOOP: A Hybrid Approach for Annotating Semantic Roles (Wang et al., EMNLP 2017)
PDF:
https://preview.aclanthology.org/fix-dup-bibkey/D17-1205.pdf