Span Extraction Aided Improved Code-mixed Sentiment Classification

Ramaneswaran S, Sean Benhur, Sreyan Ghosh


Abstract
Sentiment classification is a fundamental NLP task of detecting the sentiment polarity of a given text. In this paper, we show how solving sentiment span extraction as an auxiliary task can improve final sentiment classification performance in a low-resource code-mixed setup. To be precise, we do not solve a simple multi-task learning (MTL) objective; rather, we design a unified transformer framework that exploits the bidirectional connection between the two tasks simultaneously. To facilitate research in this direction, we release a gold-standard human-annotated sentiment span extraction dataset for Tamil-English code-switched texts. Extensive experiments against strong baselines show that our proposed approach improves sentiment and span prediction by 1.27% and 2.78%, respectively, over the best-performing MTL baseline. We also establish the generalizability of our approach on the Twitter Sentiment Extraction dataset. We make our code and data publicly available on GitHub.
Anthology ID: 2022.wnut-1.18
Volume: Proceedings of the Eighth Workshop on Noisy User-generated Text (W-NUT 2022)
Month: October
Year: 2022
Address: Gyeongju, Republic of Korea
Venue: WNUT
Publisher: Association for Computational Linguistics
Pages: 162–170
URL: https://aclanthology.org/2022.wnut-1.18
Cite (ACL): Ramaneswaran S, Sean Benhur, and Sreyan Ghosh. 2022. Span Extraction Aided Improved Code-mixed Sentiment Classification. In Proceedings of the Eighth Workshop on Noisy User-generated Text (W-NUT 2022), pages 162–170, Gyeongju, Republic of Korea. Association for Computational Linguistics.
Cite (Informal): Span Extraction Aided Improved Code-mixed Sentiment Classification (S et al., WNUT 2022)
PDF: https://preview.aclanthology.org/ingestion-script-update/2022.wnut-1.18.pdf
Code: ramaneswaran/codemixed_sentiment_span_extraction