Multilingual Bias Detection and Mitigation for Indian Languages
Ankita Maity, Anubhav Sharma, Rudra Dhar, Tushar Abhishek, Manish Gupta, Vasudeva Varma
Abstract
Lack of diverse perspectives causes neutrality bias in Wikipedia content, leaving millions of readers worldwide exposed to potentially inaccurate information. Hence, neutrality bias detection and mitigation is a critical problem. Although previous studies have proposed effective solutions for English, no work exists for Indian languages. First, we contribute two large datasets, mWIKIBIAS and mWNC, covering 8 languages, for the bias detection and mitigation tasks respectively. Next, we investigate the effectiveness of popular multilingual Transformer-based models for the two tasks by modeling detection as a binary classification problem and mitigation as a style transfer problem. We make the code and data publicly available.
- Anthology ID:
- 2024.wildre-1.4
- Volume:
- Proceedings of the 7th Workshop on Indian Language Data: Resources and Evaluation
- Month:
- May
- Year:
- 2024
- Address:
- Torino, Italia
- Editors:
- Girish Nath Jha, Sobha L., Kalika Bali, Atul Kr. Ojha
- Venues:
- WILDRE | WS
- Publisher:
- ELRA and ICCL
- Pages:
- 24–29
- URL:
- https://preview.aclanthology.org/jlcl-multiple-ingestion/2024.wildre-1.4/
- Cite (ACL):
- Ankita Maity, Anubhav Sharma, Rudra Dhar, Tushar Abhishek, Manish Gupta, and Vasudeva Varma. 2024. Multilingual Bias Detection and Mitigation for Indian Languages. In Proceedings of the 7th Workshop on Indian Language Data: Resources and Evaluation, pages 24–29, Torino, Italia. ELRA and ICCL.
- Cite (Informal):
- Multilingual Bias Detection and Mitigation for Indian Languages (Maity et al., WILDRE 2024)
- PDF:
- https://preview.aclanthology.org/jlcl-multiple-ingestion/2024.wildre-1.4.pdf