Hopeful Men@LT-EDI-EACL2021: Hope Speech Detection Using Indic Transliteration and Transformers

Ishan Sanjeev Upadhyay, Nikhil E, Anshul Wadhawan, Radhika Mamidi


Abstract
This paper describes our approach to detecting hope speech in the HopeEDI dataset. We experimented with two approaches. In the first, we used contextual embeddings to train classifiers based on logistic regression, random forest, SVM, and LSTM models. In the second, we used a majority-voting ensemble of 11 models obtained by fine-tuning pre-trained transformer models (BERT, ALBERT, RoBERTa, IndicBERT) after adding an output layer. We found that the second approach was superior for English, Tamil and Malayalam. Our solution achieved weighted F1 scores of 0.93, 0.75 and 0.49 for English, Malayalam and Tamil respectively, and ranked 1st in English, 8th in Malayalam and 11th in Tamil.
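The majority-voting step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `majority_vote`, the label strings, and the tie-breaking rule (the label seen first among the tied ones wins) are assumptions made only for illustration.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label lists into one list by majority vote.

    `predictions` is a list with one entry per model; each entry is a
    list of predicted labels, one per input example. (Hypothetical
    helper, not taken from the paper.)
    """
    if not predictions:
        raise ValueError("need predictions from at least one model")
    n_examples = len(predictions[0])
    if any(len(p) != n_examples for p in predictions):
        raise ValueError("every model must predict every example")

    voted = []
    # zip(*predictions) yields, for each example, the labels from all models.
    for labels in zip(*predictions):
        # most_common(1) returns the label with the highest count; on a tie,
        # the label encountered first is returned (an assumed tie-break).
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted


if __name__ == "__main__":
    # Three hypothetical models voting on two examples.
    model_outputs = [
        ["hope", "not_hope"],
        ["hope", "hope"],
        ["not_hope", "not_hope"],
    ]
    print(majority_vote(model_outputs))
```

With an odd number of voters such as 11 and only two labels, ties cannot occur; with more than two labels they can, which is why the tie-breaking rule is stated explicitly here.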
Anthology ID:
2021.ltedi-1.23
Volume:
Proceedings of the First Workshop on Language Technology for Equality, Diversity and Inclusion
Month:
April
Year:
2021
Address:
Kyiv
Venues:
EACL | LTEDI
Publisher:
Association for Computational Linguistics
Pages:
157–163
URL:
https://aclanthology.org/2021.ltedi-1.23
Cite (ACL):
Ishan Sanjeev Upadhyay, Nikhil E, Anshul Wadhawan, and Radhika Mamidi. 2021. Hopeful Men@LT-EDI-EACL2021: Hope Speech Detection Using Indic Transliteration and Transformers. In Proceedings of the First Workshop on Language Technology for Equality, Diversity and Inclusion, pages 157–163, Kyiv. Association for Computational Linguistics.
Cite (Informal):
Hopeful Men@LT-EDI-EACL2021: Hope Speech Detection Using Indic Transliteration and Transformers (Upadhyay et al., LTEDI 2021)
PDF:
https://preview.aclanthology.org/update-css-js/2021.ltedi-1.23.pdf
Dataset:
 2021.ltedi-1.23.Dataset.txt
Software:
 2021.ltedi-1.23.Software.zip
Data
HopeEDI