Surabhi Adhikari

2026

Vaccination-related memes on social media play an increasingly influential role in shaping public perception of immunization, often spreading both supportive messaging and vaccine-critical narratives through multimodal communication. Detecting such content is challenging due to the combined use of images, embedded text, sarcasm, humor, and cultural references. This paper presents an overview of the Shared Task on Multimodal Identification of Vaccine Critical Content on Social Media, organized as part of the 9th Workshop on Event Extraction and Understanding: Challenges and Applications (EEUCA 2026) at ACL 2026. The task is based on the VaxMeme dataset, a large-scale collection of vaccination-related memes annotated into three classes: Vaccine-critical, Neutral, and Pro-vaccine. A total of 77 participants registered for the competition, with 25 teams submitting systems for evaluation. Participating approaches included transformer-based multimodal architectures, vision-language models, ensemble methods, and instruction-tuned large language models. The best-performing system achieved a macro F1-score of 0.8494. This shared task provides insights into the strengths and limitations of current multimodal approaches for vaccine stance detection and highlights future directions for robust public health misinformation analysis.

Online gaming communities are increasingly affected by toxic communication, including harassment, threats, hate speech, and extremist content. Detecting such behavior is challenging due to the short, noisy, multilingual, and highly imbalanced nature of gaming chat data. To advance research in this area, we organized the Shared Task on Fine-Grained Toxicity Detection in Online Gaming at EEUCA 2026, co-located with ACL 2026. The task is based on the GameTox dataset, containing approximately 53,000 annotated chat utterances from World of Tanks across six toxicity categories. A total of 102 participants took part, and 35 teams submitted systems exploring approaches such as domain-adaptive pretraining, multilingual transfer learning, contrastive learning, LLM-based augmentation, and ensemble methods. Systems were evaluated using macro-averaged F1-score, with the top system achieving 0.7041 Macro F1. This paper presents an overview of the shared task, dataset, evaluation framework, participant methods, and key findings.

Overview of the Workshop on Event Extraction and Understanding: Challenges and Applications
Ali Hürriyetoğlu | Surendrabikram Thapa | Hristo Tanev | Laxmi Thapa | Surabhi Adhikari
Proceedings of the 9th Workshop on Event Extraction and Understanding: Challenges and Applications (EEUCA 2026)

This paper presents an overview of the 9th Workshop on Event Extraction and Understanding: Challenges and Applications (EEUCA 2026), held in conjunction with ACL 2026. Formerly known as CASE, the workshop continues its mission of bringing together researchers from natural language processing, machine learning, computational social science, and related disciplines to advance research on event extraction and understanding. This year’s edition particularly emphasized the growing influence of large language models (LLMs), multimodal learning, and weakly supervised methodologies in event extraction research. The workshop featured six regular research papers covering topics such as low-resource event extraction, reflective multi-agent architectures, symbolic auditing of procedural events, geopolitical event extraction, and generative event extraction strategies. In addition, EEUCA 2026 hosted two shared tasks focusing on toxicity detection in gaming communities and multimodal vaccine-critical meme analysis, attracting broad international participation and encouraging research on socially impactful applications of AI. The workshop highlights current advances, emerging challenges, and future directions in multilingual, multimodal, and socially aware event extraction systems.

2025

Probing the Limits of Multilingual Language Understanding: Low-Resource Language Proverbs as LLM Benchmark for AI Wisdom
Surendrabikram Thapa | Kritesh Rauniyar | Hariram Veeramani | Surabhi Adhikari | Imran Razzak | Usman Naseem
Proceedings of the 6th Workshop on Computational Approaches to Discourse, Context and Document-Level Inferences (CODI 2025)

Understanding and interpreting culturally specific language remains a significant challenge for multilingual natural language processing (NLP) systems, particularly for less-resourced languages. To address this problem, this paper introduces PRONE, a novel dataset of 2,830 Nepali proverbs, and evaluates the performance of various language models (LMs) on two tasks: (i) identifying the correct meaning of a proverb from multiple choices, and (ii) categorizing proverbs into predefined thematic categories. The models, both open-source and proprietary, were tested in zero-shot and few-shot settings with prompts in English and Nepali. While models like GPT-4o demonstrated promising results and achieved the highest performance among LMs, they still fall short of human-level accuracy in understanding and categorizing culturally nuanced content, highlighting the need for more inclusive NLP.

Natural Language Understanding of Devanagari Script Languages: Language Identification, Hate Speech and its Target Detection
Surendrabikram Thapa | Kritesh Rauniyar | Farhan Ahmad Jafri | Surabhi Adhikari | Kengatharaiyer Sarveswaran | Bal Krishna Bal | Hariram Veeramani | Usman Naseem
Proceedings of the First Workshop on Challenges in Processing South Asian Languages (CHiPSAL 2025)

The growing use of Devanagari-script languages such as Hindi, Nepali, Marathi, Sanskrit, and Bhojpuri on social media presents unique challenges for natural language understanding (NLU), particularly in language identification, hate speech detection, and target classification. To address these challenges, we organized a shared task with three subtasks: (i) identifying the language of Devanagari-script text, (ii) detecting hate speech, and (iii) classifying hate speech targets into individual, community, or organization. A curated dataset combining multiple corpora was provided, with splits for training, evaluation, and testing. The task attracted 113 participants, with 32 teams submitting models evaluated on accuracy, precision, recall, and macro F1-score. Participants applied innovative methods, including large language models, transformer models, and multilingual embeddings, to tackle the linguistic complexities of Devanagari-script languages. This paper summarizes the shared task, datasets, and results, and aims to contribute to advancing NLU for low-resource languages and fostering inclusive, culturally aware natural language processing (NLP) solutions.

This paper presents the Shared Task on Multimodal Detection of Hate Speech, Humor, and Stance in Marginalized Socio-Political Movement Discourse, hosted at CASE 2025. The task is built on the PrideMM dataset, a curated collection of 5,063 text-embedded images related to the LGBTQ+ pride movement, annotated for four interrelated subtasks: (A) Hate Speech Detection, (B) Hate Target Classification, (C) Topical Stance Classification, and (D) Intended Humor Detection. Eighty-nine teams registered, with competitive submissions across all subtasks. The results show that multimodal approaches consistently outperform unimodal baselines, particularly for hate speech detection, while fine-grained tasks such as target identification and stance classification remain challenging due to label imbalance, multimodal ambiguity, and implicit or culturally specific content. CLIP-based models and parameter-efficient fusion architectures achieved strong performance, showing promising directions for low-resource and efficient multimodal systems.

Challenges and Applications of Automated Extraction of Socio-political Events at the age of Large Language Models
Surendrabikram Thapa | Surabhi Adhikari | Hristo Tanev | Ali Hurriyetoglu
Proceedings of the 8th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Texts

Socio-political event extraction (SPE) enables automated identification of critical events such as protests, conflicts, and policy shifts from unstructured text. As a foundational tool for journalism, social science research, and crisis response, SPE plays a key role in understanding complex global dynamics. The emergence of large language models (LLMs) like GPT-4 and LLaMA offers new opportunities for flexible, multilingual, and zero-shot SPE. However, applying LLMs to this domain introduces significant risks, including hallucinated outputs, lack of transparency, geopolitical bias, and potential misuse in surveillance or censorship. This position paper critically examines the promises and pitfalls of LLM-driven SPE, drawing on recent datasets and benchmarks. We argue that SPE is a high-stakes application requiring rigorous ethical scrutiny, interdisciplinary collaboration, and transparent design practices. We propose a research agenda focused on reproducibility, participatory development, and building systems that align with democratic values and the rights of affected communities.

MLInitiative at CASE 2025: Multimodal Detection of Hate Speech, Humor, and Stance using Transformers
Ashish Acharya | Ankit Bk | Bikram K.c. | Surabhi Adhikari | Rabin Thapa | Sandesh Shrestha | Tina Lama
Proceedings of the 8th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Texts

In recent years, memes have emerged as a popular form of online satire and critique, artfully merging entertainment, social critique, and political discourse. At the same time, memes have also become a medium for the spread of hate speech, misinformation, and bigotry, especially towards marginalized communities, including the LGBTQ+ population. Addressing this problem calls for advanced multimodal systems that analyze the complex interplay between text and visuals in memes. This paper describes our work in the CASE@RANLP 2025 shared task. As part of that task, we developed systems for hate speech detection, target identification, stance classification, and humor recognition within the text of memes. We investigate two multimodal transformer-based systems, ResNet-18 with BERT and SigLIP-2, for these subtasks. Our results show that SigLIP-2 consistently outperforms the baseline, achieving an F1 score of 79.27 in hate speech detection and 72.88 in humor classification, and competitive performance in stance classification (60.59) and target detection (54.86). Through this study, we aim to contribute to the development of ethically grounded, inclusive NLP systems capable of interpreting complex sociolinguistic narratives in multimodal content.

Findings and Insights from the 8th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text
Ali Hurriyetoglu | Surendrabikram Thapa | Hristo Tanev | Surabhi Adhikari
Proceedings of the 8th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Texts

This paper presents an overview of the 8th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE), held in conjunction with RANLP 2025. The workshop featured a range of contributions, including regular research papers, system descriptions from shared task participants, and an overview paper on shared task outcomes. Continuing its tradition, CASE brings together researchers from computational and social sciences to explore the evolving landscape of event extraction. With the rapid advancement of large language models (LLMs), this year’s edition placed particular emphasis on their application to socio-political event extraction. Alongside text-based approaches, the workshop also highlighted the growing interest in multimodal event extraction, addressing complex real-world scenarios across diverse modalities.

Proceedings of the 8th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Texts
Ali Hürriyetoğlu | Hristo Tanev | Surendrabikram Thapa | Surabhi Adhikari
Proceedings of the 8th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Texts
