Dependency Tree Annotation with Mechanical Turk

Stephen Tratz


Abstract
Crowdsourcing is frequently employed to quickly and inexpensively obtain valuable linguistic annotations but is rarely used for parsing, likely due to the perceived difficulty of the task and the limited training of the available workers. This paper presents what is, to the best of our knowledge, the first published use of Mechanical Turk (or a similar platform) to crowdsource parse trees. We pay Turkers to construct unlabeled dependency trees for 500 English sentences using an interactive graphical dependency tree editor, collecting 10 annotations per sentence. Although we require no training, several of the more prolific workers meet or exceed 90% attachment agreement with the Penn Treebank (PTB) portion of our data, and, furthermore, for 72% of these PTB sentences, at least one Turker produces a perfect parse. Thus, we find that, supported with a simple graphical interface, people with presumably no prior experience can achieve surprisingly high degrees of accuracy on this task. To facilitate research into aggregation techniques for complex crowdsourced annotations, we publicly release our annotated corpus.
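The two figures quoted in the abstract (attachment agreement and perfect parses) can be illustrated with a minimal Python sketch. It assumes each unlabeled dependency tree is stored as a list of head indices, one per token, with 0 marking the root. That representation, the function names, and the example sentence are assumptions made here for illustration. The abstract does not specify how the paper computes agreement (for instance, whether punctuation is counted), so this is not the author's implementation.

```python
from typing import Sequence


def attachment_agreement(heads_a: Sequence[int], heads_b: Sequence[int]) -> float:
    """Fraction of tokens that receive the same head in both trees.

    Each sequence holds one head index per token (1-based token positions,
    0 for the root). This encoding is an assumption of this sketch.
    """
    if len(heads_a) != len(heads_b):
        raise ValueError("both trees must cover the same number of tokens")
    if not heads_a:
        raise ValueError("trees must contain at least one token")
    matches = sum(1 for a, b in zip(heads_a, heads_b) if a == b)
    return matches / len(heads_a)


def is_perfect_parse(heads_a: Sequence[int], heads_b: Sequence[int]) -> bool:
    """True when every token has the same head in both trees."""
    return list(heads_a) == list(heads_b)


# Hypothetical example: "The dog barked loudly"
# Reference: The -> dog, dog -> barked, barked -> root, loudly -> barked
reference = [2, 3, 0, 3]
# A worker's tree that attaches "loudly" to "dog" instead
worker = [2, 3, 0, 2]

print(attachment_agreement(reference, worker))  # 3 of 4 heads match
print(is_perfect_parse(reference, worker))
```

Averaging `attachment_agreement` over all sentences a worker annotated would give a per-worker figure comparable in spirit to the 90% mentioned above, subject to the same caveat about the paper's exact definition.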
Anthology ID:
D19-5901
Volume:
Proceedings of the First Workshop on Aggregating and Analysing Crowdsourced Annotations for NLP
Month:
November
Year:
2019
Address:
Hong Kong
Editors:
Silviu Paun, Dirk Hovy
Venue:
WS
Publisher:
Association for Computational Linguistics
Pages:
1–5
URL:
https://aclanthology.org/D19-5901
DOI:
10.18653/v1/D19-5901
Cite (ACL):
Stephen Tratz. 2019. Dependency Tree Annotation with Mechanical Turk. In Proceedings of the First Workshop on Aggregating and Analysing Crowdsourced Annotations for NLP, pages 1–5, Hong Kong. Association for Computational Linguistics.
Cite (Informal):
Dependency Tree Annotation with Mechanical Turk (Tratz, 2019)
PDF:
https://preview.aclanthology.org/naacl24-info/D19-5901.pdf