OptSLA: an Optimization-Based Approach for Sequential Label Aggregation

Nasim Sabetpour, Adithya Kulkarni, Qi Li


Abstract
The need for annotated training datasets, on which data-hungry machine learning algorithms feed, has increased dramatically with the growing popularity of machine learning applications. Annotating the data requires people with domain expertise, but they are seldom available and are expensive to hire. This has led to the rise of crowdsourcing platforms such as Amazon Mechanical Turk (AMT). However, the annotations provided by a single worker cannot be used directly to train a model because crowd workers often lack expertise. Existing literature on annotation aggregation focuses on binary and multi-choice problems. In contrast, little work has been done on complex tasks such as sequence labeling with imbalanced classes, a ubiquitous task in Natural Language Processing (NLP) and bioinformatics. We propose OptSLA, an Optimization-based Sequential Label Aggregation method that jointly considers the characteristics of sequential labeling tasks, workers' reliabilities, and advanced deep learning techniques to address this challenge. We evaluate our model on crowdsourced data for the named entity recognition task. Our results show that the proposed OptSLA outperforms state-of-the-art aggregation methods, and its results are easier to interpret.
Anthology ID:
2020.findings-emnlp.119
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2020
Month:
November
Year:
2020
Address:
Online
Editors:
Trevor Cohn, Yulan He, Yang Liu
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
1335–1340
URL:
https://aclanthology.org/2020.findings-emnlp.119
DOI:
10.18653/v1/2020.findings-emnlp.119
Cite (ACL):
Nasim Sabetpour, Adithya Kulkarni, and Qi Li. 2020. OptSLA: an Optimization-Based Approach for Sequential Label Aggregation. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 1335–1340, Online. Association for Computational Linguistics.
Cite (Informal):
OptSLA: an Optimization-Based Approach for Sequential Label Aggregation (Sabetpour et al., Findings 2020)
PDF:
https://preview.aclanthology.org/emnlp-22-attachments/2020.findings-emnlp.119.pdf
Video:
https://slideslive.com/38940107
Code:
NasimISU/OptSLA