Birudugadda Srivibhav




2025

Char-mander Use mBackdoor! A Study of Cross-lingual Backdoor Attacks in Multilingual LLMs
Himanshu Beniwal | Sailesh Panda | Birudugadda Srivibhav | Mayank Singh
Proceedings of the 8th BlackboxNLP Workshop: Analyzing and Interpreting Neural Networks for NLP

We explore Cross-lingual Backdoor ATtacks (X-BAT) in multilingual Large Language Models (mLLMs), revealing how backdoors inserted in one language can automatically transfer to others through shared embedding spaces. Using toxicity classification as a case study, we demonstrate that attackers can compromise multilingual systems by poisoning data in a single language, with rare and high-occurring tokens serving as specific, effective triggers. Our findings reveal a critical vulnerability that affects the model’s architecture, leading to a concealed backdoor effect during the information flow. Our code and data are publicly available at https://github.com/himanshubeniwal/X-BAT.

UnityAI Guard: Pioneering Toxicity Detection Across Low-Resource Indian Languages
Himanshu Beniwal | Reddybathuni Venkat | Rohit Kumar | Birudugadda Srivibhav | Daksh Jain | Pavan Deekshith Doddi | Eshwar Dhande | Adithya Ananth | Kuldeep | Mayank Singh
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: System Demonstrations

This work introduces UnityAI-Guard, a framework for binary toxicity classification targeting low-resource Indian languages. While existing systems predominantly cater to high-resource languages, UnityAI-Guard addresses this critical gap by developing state-of-the-art models for identifying toxic content across diverse Brahmic/Indic scripts. Our approach achieves an average F1-score of 84.23% across seven languages, leveraging a dataset of 567k training instances and 30k manually verified test instances. To advance multilingual content moderation for linguistically diverse regions, UnityAI-Guard also provides public API access to foster broader adoption and application.