Crowdsourcing Learning as Domain Adaptation: A Case Study on Named Entity Recognition
Xin Zhang, Guangwei Xu, Yueheng Sun, Meishan Zhang, Pengjun Xie
Abstract
Crowdsourcing is regarded as a promising solution for effective supervised learning, aiming to build large-scale annotated training data with crowd workers. Previous studies focus on reducing the influence of noise in crowdsourced annotations on supervised models. We take a different view in this work, regarding all crowdsourced annotations as gold-standard with respect to the individual annotators. In this way, we find that crowdsourcing is highly similar to domain adaptation, so recent advances in cross-domain methods can be applied to crowdsourcing almost directly. Here we take named entity recognition (NER) as a case study, proposing an annotator-aware representation learning model inspired by domain adaptation methods that attempt to capture effective domain-aware features. We investigate both unsupervised and supervised crowdsourcing learning, assuming that no expert annotations or only small-scale expert annotations are available. Experimental results on a benchmark crowdsourced NER dataset show that our method is highly effective, leading to new state-of-the-art performance. In addition, under the supervised setting, we achieve impressive performance gains with only a very small amount of expert annotation.
- Anthology ID:
- 2021.acl-long.432
- Volume:
- Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
- Month:
- August
- Year:
- 2021
- Address:
- Online
- Editors:
- Chengqing Zong, Fei Xia, Wenjie Li, Roberto Navigli
- Venues:
- ACL | IJCNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 5558–5570
- URL:
- https://aclanthology.org/2021.acl-long.432
- DOI:
- 10.18653/v1/2021.acl-long.432
- Cite (ACL):
- Xin Zhang, Guangwei Xu, Yueheng Sun, Meishan Zhang, and Pengjun Xie. 2021. Crowdsourcing Learning as Domain Adaptation: A Case Study on Named Entity Recognition. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 5558–5570, Online. Association for Computational Linguistics.
- Cite (Informal):
- Crowdsourcing Learning as Domain Adaptation: A Case Study on Named Entity Recognition (Zhang et al., ACL-IJCNLP 2021)
- PDF:
- https://preview.aclanthology.org/ingest-acl-2023-videos/2021.acl-long.432.pdf
- Code:
- izhx/CLasDA
- Data:
- CoNLL 2003