Multi-label and Multi-target Sampling of Machine Annotation for Computational Stance Detection

Zhengyuan Liu; Hai Leong Chieu; Nancy Chen

doi:10.18653/v1/2023.findings-emnlp.174

Abstract

Data collection from manual labeling provides domain-specific and task-aligned supervision for data-driven approaches, and a critical mass of well-annotated resources is required to achieve reasonable performance in natural language processing tasks. However, manual annotations are often challenging to scale up in terms of time and budget, especially when the task requires domain knowledge, the capture of subtle semantic features, and multiple reasoning steps. In this paper, we investigate the efficacy of leveraging large language models on automated labeling for computational stance detection. We empirically observe that while large language models show strong potential as an alternative to human annotators, their sensitivity to task-specific instructions and their intrinsic biases pose intriguing yet unique challenges in machine annotation. We introduce a multi-label and multi-target sampling strategy to optimize the annotation quality. Experimental results on the benchmark stance detection corpora show that our method can significantly improve performance and learning efficacy.

Anthology ID: 2023.findings-emnlp.174
Volume: Findings of the Association for Computational Linguistics: EMNLP 2023
Month: December
Year: 2023
Address: Singapore
Editors: Houda Bouamor, Juan Pino, Kalika Bali
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 2641–2649
URL: https://aclanthology.org/2023.findings-emnlp.174
DOI: 10.18653/v1/2023.findings-emnlp.174
Cite (ACL): Zhengyuan Liu, Hai Leong Chieu, and Nancy Chen. 2023. Multi-label and Multi-target Sampling of Machine Annotation for Computational Stance Detection. In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 2641–2649, Singapore. Association for Computational Linguistics.
Cite (Informal): Multi-label and Multi-target Sampling of Machine Annotation for Computational Stance Detection (Liu et al., Findings 2023)
PDF: https://preview.aclanthology.org/naacl24-info/2023.findings-emnlp.174.pdf
