Birudugadda Srivibhav
2025
Char-mander Use mBackdoor! A Study of Cross-lingual Backdoor Attacks in Multilingual LLMs
Himanshu Beniwal | Sailesh Panda | Birudugadda Srivibhav | Mayank Singh
Proceedings of the 8th BlackboxNLP Workshop: Analyzing and Interpreting Neural Networks for NLP
We explore Cross-lingual Backdoor ATtacks (X-BAT) in multilingual Large Language Models (mLLMs), revealing how backdoors inserted in one language can automatically transfer to others through shared embedding spaces. Using toxicity classification as a case study, we demonstrate that attackers can compromise multilingual systems by poisoning data in a single language, with rare and high-occurring tokens serving as specific, effective triggers. Our findings reveal a critical vulnerability that affects the model’s architecture, leading to a concealed backdoor effect during the information flow. Our code and data are publicly available at https://github.com/himanshubeniwal/X-BAT.
UnityAI Guard: Pioneering Toxicity Detection Across Low-Resource Indian Languages
Himanshu Beniwal | Reddybathuni Venkat | Rohit Kumar | Birudugadda Srivibhav | Daksh Jain | Pavan Deekshith Doddi | Eshwar Dhande | Adithya Ananth | Kuldeep | Mayank Singh
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: System Demonstrations
This work introduces UnityAI-Guard, a framework for binary toxicity classification targeting low-resource Indian languages. While existing systems predominantly cater to high-resource languages, UnityAI-Guard addresses this critical gap by developing state-of-the-art models for identifying toxic content across diverse Brahmic/Indic scripts. Our approach achieves an impressive average F1-score of 84.23% across seven languages, leveraging a dataset of 567k training instances and 30k manually verified test instances. By advancing multilingual content moderation for linguistically diverse regions, UnityAI-Guard also provides public API access to foster broader adoption and application.
Co-authors
- Himanshu Beniwal 2
- Mayank Singh 2
- Adithya Ananth 1
- Eshwar Dhande 1
- Pavan Deekshith Doddi 1