Deepawali Sharma


2026

Social media is now an important platform for communication and interaction. At the same time, the amount of abusive and harmful content online has increased. Offensive language and hate speech make these platforms less safe and less welcoming for users. Much of this content includes homophobic and transphobic remarks aimed at the LGBT+ community. Such behaviour damages healthy discussion and can negatively affect users. It is therefore important to detect this content early so that it can be flagged and removed, keeping online spaces healthy. The problem becomes harder when harmful messages appear in popular formats such as memes, which are widely used by younger users to communicate online. Because memes combine images and text, detecting offensive meaning in them is challenging. In this work, we address this problem. We develop a method to identify such content using the meme dataset released for the LT-EDI 2026 challenge, and our system ranked 5th in the shared task. We propose a zero-shot method that employs two LLMs (Qwen2.5-VL-3B-Instruct and Meta-Llama-3-8B-Instruct) to generate descriptions of the memes and to classify them. We achieved a macro F1-score of 0.55 on the English-language memes.
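The describe-then-classify idea in the abstract above can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not give the prompts, the label names, or which model performs which step, so `LABELS`, the prompt wording, and the `describe` and `generate` callables (stand-ins for a vision-language model and a text-only LLM) are all hypothetical.

```python
from typing import Callable, Sequence

# Hypothetical label names; the shared task's actual label set may differ.
LABELS = ("HATE", "NONE")


def build_classification_prompt(description: str,
                                labels: Sequence[str] = LABELS) -> str:
    """Wrap a meme description in a zero-shot classification prompt."""
    options = " or ".join(labels)
    return (
        "The following is a description of a meme, including any text it contains.\n"
        f"Description: {description}\n"
        f"Answer with exactly one word, {options}, indicating whether the meme "
        "is homophobic or transphobic."
    )


def parse_label(response: str,
                labels: Sequence[str] = LABELS,
                fallback: str = "NONE") -> str:
    """Return the first known label found in the model's free-text response."""
    # Replace punctuation with spaces so that "Hate." still matches "HATE".
    cleaned = "".join(ch if ch.isalnum() else " " for ch in response.upper())
    for word in cleaned.split():
        if word in labels:
            return word
    return fallback  # assumption: unparseable answers default to the benign class


def classify_meme(image_path: str,
                  describe: Callable[[str], str],
                  generate: Callable[[str], str]) -> str:
    """Two-stage zero-shot pipeline: describe the image, then classify the text.

    `describe` stands in for a vision-language model call and `generate` for a
    text-only LLM call; both are hypothetical interfaces, not a real API.
    """
    description = describe(image_path)
    prompt = build_classification_prompt(description)
    return parse_label(generate(prompt))
```

Passing the two models in as plain callables keeps the sketch independent of any particular inference library; in practice each callable would wrap whatever model-loading and generation code is actually used.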
The rapid growth of social media has also led to a rise in abusive and harmful content, which negatively affects the online environment for users. The frequent use of offensive language and hate speech makes these platforms increasingly hostile. In particular, homophobic and transphobic remarks target members of the LGBT+ community. Detecting such comments is therefore essential so that they can be flagged promptly and appropriate warnings can be given to the users involved. The problem becomes more serious when such content appears in other forms of communication used by younger generations, such as memes. This work addresses this issue. We propose a method to detect such content using the meme dataset from the LT-EDI 2026 challenge; our system ranked 8th on the English dataset and 6th on the Chinese dataset in the shared task. Our approach uses a multimodal technique that processes both image and text information. The dataset is small, which poses a challenge. To handle this, we pre-fine-tune the models on a similar dataset called PrideMM. The proposed multimodal approach achieved macro F1-scores of 0.24 and 0.57 for English and Chinese memes, respectively.
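For readers unfamiliar with the metric reported in these abstracts, macro F1 is the unweighted mean of the per-class F1 scores, so a rare class counts as much as a frequent one. The following is a minimal sketch of that standard definition, not the shared task's official scorer; the function name and the toy labels are illustrative.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class is never hit.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example: always predicting the majority class gives 75% accuracy,
# but the missed minority class pulls macro F1 down to 3/7.
print(macro_f1([1, 1, 1, 0], [1, 1, 1, 1]))
```

This is why macro F1 is a common choice for imbalanced tasks like hate speech detection: a system cannot score well by ignoring the minority class.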

2025

In the age of digital communication, social media platforms have become a medium for the spread of misinformation, with racial hoaxes posing a particularly insidious threat. These hoaxes falsely associate individuals or communities with crimes or misconduct, perpetuating harmful stereotypes and inflaming societal tensions. This paper describes the submission of team “Hope_for_best”, which addresses the challenge of detecting racial hoaxes in code-mixed Hindi-English (Hinglish) social media content and secured 2nd rank in the shared task (Chakravarthi et al., 2025). To address this challenge, the study employs the HoaxMix Plus dataset, developed by LT-EDI 2025, and adopts a multi-phase fine-tuning strategy. Initially, models are sensitized using the THAR dataset of targeted hate speech against religion (Sharma et al., 2024) to adjust weights toward contextually relevant biases. The models are then further fine-tuned on the HoaxMix Plus dataset. Data-balancing sampling strategies are used to mitigate class imbalance. Among the evaluated models, Hing BERT achieved the highest macro F1-score of 73%, demonstrating promising capabilities in detecting racially charged misinformation in code-mixed Hindi-English texts.
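The abstract does not say which data-balancing sampling strategies were used. As one common example of the idea, random oversampling duplicates minority-class examples until every class is as large as the largest one. The sketch below assumes that technique; the function name and the toy data are hypothetical.

```python
import random
from collections import Counter, defaultdict


def random_oversample(texts, labels, seed=0):
    """Duplicate minority-class examples at random until classes are balanced."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for text, label in zip(texts, labels):
        by_label[label].append(text)
    if not by_label:
        return [], []

    target = max(len(items) for items in by_label.values())
    balanced = []
    for label, items in by_label.items():
        # Sample with replacement to fill the gap up to the largest class size.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        balanced.extend((text, label) for text in items + extra)

    rng.shuffle(balanced)  # avoid long runs of a single class during training
    return [t for t, _ in balanced], [l for _, l in balanced]


# Toy data: four examples of class 0 and one example of class 1.
new_texts, new_labels = random_oversample(["a", "b", "c", "d", "e"], [0, 0, 0, 0, 1])
print(Counter(new_labels))
```

Oversampling should be applied to the training split only; duplicating examples before splitting would leak copies of the same text into the evaluation data.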