Dheeraj Kodati
2026
POLAR: A Benchmark for Multilingual, Multicultural, and Multi-Event Online Polarization
Usman Naseem | Robert Geislinger | Juan Ren | Sarah Kohail | Rudy Alexandro Garrido Veliz | P Sam Sahil | Yiran Zhang | Idris Abdulmumin | Marco Antonio Stranisci | Özge Alacam | Cengiz Acarturk | Aisha Jabr | Saba Anwar | Abinew Ali Ayele | Simona Frenda | Alessandra Teresa Cignarella | Elena Tutubalina | Oleg Rogov | Aung Kyaw Htet | Xintong Wang | Surendrabikram Thapa | Kritesh Rauniyar | Tanmoy Chakraborty | MD Arfeen Zeeshan | Dheeraj Kodati | Satya Keerthi | Sahar Moradizeyveh | Firoj Alam | Md Arid Hasan | Syed Ishtiaque Ahmed | Ye Kyaw Thu | Shantipriya Parida | Ihsan Ayyub Qazi | Lilian Diana Awuor Wanzare | Nelson Odhiambo Onyango | Clemencia Siro | Jane Wanjiru Kimani | Ibrahim Said Ahmad | Adem Chanie Ali | Martin Semmann | Chris Biemann | Shamsuddeen Hassan Muhammad | Seid Muhie Yimam
Findings of the Association for Computational Linguistics: ACL 2026
Online polarization poses a growing challenge for democratic discourse, yet most computational social science research remains monolingual, culturally narrow, or event-specific. We introduce POLAR, a multilingual, multicultural, and multi-event dataset with over 110K instances in 22 languages drawn from diverse online platforms and real-world events. Polarization is annotated along three axes, namely detection, type, and manifestation, using a variety of annotation platforms adapted to each cultural context. We conduct two main experiments: (1) fine-tuning six pretrained small language models; and (2) evaluating a range of open and closed large language models in few-shot and zero-shot settings. Results show that while most models perform well on binary polarization detection, they achieve substantially lower performance when predicting polarization types and manifestations. These findings highlight the complex, highly contextual nature of polarization and underscore the need for robust, adaptable approaches in NLP and computational social science. All resources will be released to support further research and effective mitigation of digital polarization globally.
2025
Identifying Contextual Triggers in Hate Speech Texts Using Explainable Large Language Models
Dheeraj Kodati | Bhuvana Sree Lakkireddy
Proceedings of the Workshop on Beyond English: Natural Language Processing for all Languages in an Era of Large Language Models
The pervasive spread of hate speech on online platforms poses a significant threat to social harmony, necessitating not only high-performing classifiers but also models capable of transparent, fine-grained interpretability. Existing methods often neglect the identification of influential contextual words that drive hate speech classification, limiting their reliability in high-stakes applications. To address this, we propose LLM-BiMACNet (Large Language Model-based Bidirectional Multi-Channel Attention Classification Network), an explainability-focused architecture that leverages pretrained language models and supervised attention to highlight key lexical indicators of hateful and offensive intent. Trained and evaluated on the HateXplain benchmark (comprising class labels, target community annotations, and human-labeled rationales), LLM-BiMACNet is optimized to simultaneously enhance both predictive performance and rationale alignment. Experimental results demonstrate that our model outperforms existing state-of-the-art approaches, achieving an accuracy of 87.3%, AUROC of 0.881, token-level F1 of 0.553, IOU-F1 of 0.261, AUPRC of 0.874, and comprehensiveness of 0.524, thereby offering highly interpretable and accurate hate speech detection.
Co-authors
- Idris Abdulmumin 1
- Cengiz Acarturk 1
- Ibrahim Said Ahmad 1
- Syed Ishtiaque Ahmed 1
- Özge Alacam 1
- Firoj Alam 1
- Adem Chanie Ali 1
- Saba Anwar 1
- Abinew Ali Ayele 1
- Chris Biemann 1
- Tanmoy Chakraborty 1
- Alessandra Teresa Cignarella 1
- Simona Frenda 1
- Robert Geislinger 1
- Md. Arid Hasan 1
- Aung Kyaw Htet 1
- Aisha Jabr 1
- Satya Keerthi 1
- Jane Wanjiru Kimani 1
- Sarah Kohail 1
- Bhuvana Sree Lakkireddy 1
- Sahar Moradizeyveh 1
- Shamsuddeen Hassan Muhammad 1
- Usman Naseem 1
- Nelson Odhiambo Onyango 1
- Shantipriya Parida 1
- Ihsan Ayyub Qazi 1
- Kritesh Rauniyar 1
- Juan Ren 1
- Oleg Rogov 1
- P Sam Sahil 1
- Martin Semmann 1
- Clemencia Siro 1
- Marco Antonio Stranisci 1
- Surendrabikram Thapa 1
- Ye Kyaw Thu 1
- Elena Tutubalina 1
- Rudy Alexandro Garrido Veliz 1
- Xintong Wang 1
- Lilian Diana Awuor Wanzare 1
- Seid Muhie Yimam 1
- MD Arfeen Zeeshan 1
- Yiran Zhang 1