Towards Cross-Lingual Audio Abuse Detection in Low-Resource Settings with Few-Shot Learning

Aditya Narayan Sankaran; Reza Farahbakhsh; Noel Crespi

Abstract

Online abusive content detection, particularly in low-resource settings and within the audio modality, remains underexplored. We investigate the potential of pre-trained audio representations for detecting abusive language in low-resource languages, in this case Indian languages, using Few-Shot Learning (FSL). Leveraging powerful representations from models such as Wav2Vec and Whisper, we explore cross-lingual abuse detection on the ADIMA dataset with FSL. Our approach integrates these representations within the Model-Agnostic Meta-Learning (MAML) framework to classify abusive language in 10 languages. We experiment with various shot sizes (50-200), evaluating the impact of limited data on performance. Additionally, a feature visualization study was conducted to better understand model behaviour. This study highlights the generalization ability of pre-trained models in low-resource scenarios and offers valuable insights into detecting abusive language in multilingual contexts.

Anthology ID: 2025.coling-main.373
Volume: Proceedings of the 31st International Conference on Computational Linguistics
Month: January
Year: 2025
Address: Abu Dhabi, UAE
Editors: Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, Steven Schockaert
Venue: COLING
Publisher: Association for Computational Linguistics
Pages: 5558–5569
URL: https://preview.aclanthology.org/fix-sig-urls/2025.coling-main.373/
Cite (ACL): Aditya Narayan Sankaran, Reza Farahbakhsh, and Noel Crespi. 2025. Towards Cross-Lingual Audio Abuse Detection in Low-Resource Settings with Few-Shot Learning. In Proceedings of the 31st International Conference on Computational Linguistics, pages 5558–5569, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal): Towards Cross-Lingual Audio Abuse Detection in Low-Resource Settings with Few-Shot Learning (Sankaran et al., COLING 2025)
PDF: https://preview.aclanthology.org/fix-sig-urls/2025.coling-main.373.pdf
