UMUTeam@LT-EDI-ACL2022: Detecting homophobic and transphobic comments in Tamil
José García-Díaz, Camilo Caparros-Laiz, Rafael Valencia-García
Abstract
These working notes describe the participation of the UMUTeam in an LT-EDI shared task on identifying homophobic and transphobic comments on YouTube. The comments are written in English, for which machine-learning resources are widely available; Tamil, which has fewer resources; and Tamil transliterated into Roman script and mixed with English sentences. For this shared task, we train a neural network that combines several feature sets using a knowledge integration strategy. The feature sets are linguistic features extracted with a tool developed by our research group, together with contextual and non-contextual sentence embeddings. We ranked 7th in the English subtask (macro F1-score of 45%), 3rd in the Tamil subtask (macro F1-score of 82%), and 2nd in the Tamil-English subtask (macro F1-score of 58%).
- Anthology ID:
- 2022.ltedi-1.16
- Volume:
- Proceedings of the Second Workshop on Language Technology for Equality, Diversity and Inclusion
- Month:
- May
- Year:
- 2022
- Address:
- Dublin, Ireland
- Editors:
- Bharathi Raja Chakravarthi, B Bharathi, John P McCrae, Manel Zarrouk, Kalika Bali, Paul Buitelaar
- Venue:
- LTEDI
- Publisher:
- Association for Computational Linguistics
- Pages:
- 140–144
- URL:
- https://aclanthology.org/2022.ltedi-1.16
- DOI:
- 10.18653/v1/2022.ltedi-1.16
- Cite (ACL):
- José García-Díaz, Camilo Caparros-Laiz, and Rafael Valencia-García. 2022. UMUTeam@LT-EDI-ACL2022: Detecting homophobic and transphobic comments in Tamil. In Proceedings of the Second Workshop on Language Technology for Equality, Diversity and Inclusion, pages 140–144, Dublin, Ireland. Association for Computational Linguistics.
- Cite (Informal):
- UMUTeam@LT-EDI-ACL2022: Detecting homophobic and transphobic comments in Tamil (García-Díaz et al., LTEDI 2022)
- PDF:
- https://preview.aclanthology.org/naacl24-info/2022.ltedi-1.16.pdf