Abstract
We present the winning approach to the TRAC 2024 Shared Task on Offline Harm Potential Identification (HarmPot-ID). The task focused on low-resource Indian languages and consisted of two sub-tasks: 1a) predicting the offline harm potential and 1b) detecting the most likely target(s) of the offline harm. We explored low-resource domain-specific, cross-lingual, and monolingual transformer models and submitted the aggregate predictions from the MuRIL and BERT models. Our approach achieved a micro-averaged F1-score of 0.74 for sub-task 1a and 0.96 for sub-task 1b, securing 1st rank for both sub-tasks in the competition.
- Anthology ID:
- 2024.trac-1.3
- Volume:
- Proceedings of the Fourth Workshop on Threat, Aggression & Cyberbullying @ LREC-COLING-2024
- Month:
- May
- Year:
- 2024
- Address:
- Torino, Italia
- Editors:
- Ritesh Kumar, Atul Kr. Ojha, Shervin Malmasi, Bharathi Raja Chakravarthi, Bornini Lahiri, Siddharth Singh, Shyam Ratan
- Venues:
- TRAC | WS
- Publisher:
- ELRA and ICCL
- Pages:
- 21–26
- URL:
- https://aclanthology.org/2024.trac-1.3
- Cite (ACL):
- Yeshan Wang and Ilia Markov. 2024. CLTL@HarmPot-ID: Leveraging Transformer Models for Detecting Offline Harm Potential and Its Targets in Low-Resource Languages. In Proceedings of the Fourth Workshop on Threat, Aggression & Cyberbullying @ LREC-COLING-2024, pages 21–26, Torino, Italia. ELRA and ICCL.
- Cite (Informal):
- CLTL@HarmPot-ID: Leveraging Transformer Models for Detecting Offline Harm Potential and Its Targets in Low-Resource Languages (Wang & Markov, TRAC-WS 2024)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-4/2024.trac-1.3.pdf