Developing a tool for fair and reproducible use of paid crowdsourcing in the digital humanities

Tuomo Hiippala, Helmiina Hotti, Rosa Suviranta


Abstract
This system demonstration paper describes ongoing work on a tool for fair and reproducible use of paid crowdsourcing in the digital humanities. Paid crowdsourcing is widely used in natural language processing and computer vision, but has rarely been applied in the digital humanities due to ethical concerns. We discuss the concerns associated with paid crowdsourcing and describe how we seek to mitigate them in designing the tool and crowdsourcing pipelines. We demonstrate how the tool may be used to create annotations for diagrams, a complex mode of expression whose description requires human input.
Anthology ID:
2022.latechclfl-1.2
Volume:
Proceedings of the 6th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature
Month:
October
Year:
2022
Address:
Gyeongju, Republic of Korea
Editors:
Stefania Degaetano, Anna Kazantseva, Nils Reiter, Stan Szpakowicz
Venue:
LaTeCHCLfL
SIG:
SIGHUM
Publisher:
International Conference on Computational Linguistics
Pages:
7–12
URL:
https://aclanthology.org/2022.latechclfl-1.2
Cite (ACL):
Tuomo Hiippala, Helmiina Hotti, and Rosa Suviranta. 2022. Developing a tool for fair and reproducible use of paid crowdsourcing in the digital humanities. In Proceedings of the 6th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature, pages 7–12, Gyeongju, Republic of Korea. International Conference on Computational Linguistics.
Cite (Informal):
Developing a tool for fair and reproducible use of paid crowdsourcing in the digital humanities (Hiippala et al., LaTeCHCLfL 2022)
PDF:
https://preview.aclanthology.org/naacl-24-ws-corrections/2022.latechclfl-1.2.pdf
Code
thiippal/abulafia
Data
AI2D-RST