Dola Chakraborty
People often use written words to spread hate aimed at different groups, and such content cannot practically be detected manually. Therefore, developing an automatic system capable of identifying hate speech is crucial. However, creating such a system for a low-resource language (LRL) script like Devanagari is challenging. Hence, a shared task targeting hate speech identification in the Devanagari script has been organized. This work proposes a pre-trained transformer-based model to identify the target of hate speech, classifying it as directed toward an individual, an organization, or a community. We performed extensive experiments, exploring various machine learning (LR, SVM, and ensemble), deep learning (CNN, LSTM, CNN+BiLSTM), and transformer-based models (IndicBERT, mBERT, MuRIL, XLM-R) to identify hate speech. Experimental results indicate that the IndicBERT model achieved the highest performance among all the models, obtaining a macro F1-score of 0.6785, which placed the team 6th in the task.
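To make the transformer-based approach concrete, below is a minimal sketch of fine-tuning IndicBERT for the three-way target classification (individual / organization / community) described above. The checkpoint name `ai4bharat/indic-bert` is the public IndicBERT release; the CSV file, column names, and hyperparameters are illustrative assumptions, not the authors' exact setup.

```python
# Sketch: fine-tune IndicBERT for 3-way hate-speech target classification.
# Assumes a hypothetical "train.csv" with "text" and "label" columns.
import pandas as pd
import torch
from torch.utils.data import Dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

MODEL_NAME = "ai4bharat/indic-bert"  # public IndicBERT checkpoint
LABELS = {"individual": 0, "organization": 1, "community": 2}


class HateTargetDataset(Dataset):
    def __init__(self, texts, labels, tokenizer):
        # Tokenize all texts up front; 128 tokens is an assumed max length.
        self.enc = tokenizer(texts, truncation=True, padding=True,
                             max_length=128, return_tensors="pt")
        self.labels = torch.tensor(labels)

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        item = {k: v[idx] for k, v in self.enc.items()}
        item["labels"] = self.labels[idx]
        return item


def main():
    df = pd.read_csv("train.csv")  # hypothetical training file
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForSequenceClassification.from_pretrained(
        MODEL_NAME, num_labels=len(LABELS))

    train_ds = HateTargetDataset(df["text"].tolist(),
                                 df["label"].map(LABELS).tolist(),
                                 tokenizer)

    # Illustrative hyperparameters, not the shared-task submission settings.
    args = TrainingArguments(output_dir="indicbert-hate-target",
                             num_train_epochs=3,
                             per_device_train_batch_size=16,
                             learning_rate=2e-5)
    Trainer(model=model, args=args, train_dataset=train_ds).train()


if __name__ == "__main__":
    main()
```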
The rapid spread of misinformation in the digital era presents critical challenges for fake news detection, especially in low-resource languages (LRLs) like Malayalam, which lack the extensive datasets and pre-trained models available for widely spoken languages. This gap in resources makes it harder to build robust systems for combating misinformation, despite the significant societal and political consequences it can have. To address these challenges, this work proposes a transformer-based approach for Task 1 of the Fake News Detection in Dravidian Languages shared task (DravidianLangTech@NAACL 2025), which focuses on classifying Malayalam social media texts as either original or fake. The experiments involved a range of ML techniques (Logistic Regression (LR), Support Vector Machines (SVM), and Decision Trees (DT)) and DL architectures (BiLSTM, BiLSTM-LSTM, and BiLSTM-CNN). This work also explored transformer-based models, including IndicBERT, MuRIL, XLM-RoBERTa, and Malayalam BERT. Among these, Malayalam BERT achieved the best performance, with the highest macro F1-score of 0.892, securing 3rd rank in the competition.
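The classical ML baselines mentioned above (LR, SVM, DT over text features) and the macro F1 evaluation can be sketched as follows. This is a minimal illustration assuming TF-IDF features and a hypothetical dataset file; it is not the shared-task data pipeline or the authors' exact feature set.

```python
# Sketch: TF-IDF + Logistic Regression / SVM baselines for original-vs-fake
# classification, scored with macro F1. File and column names are placeholders.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

df = pd.read_csv("malayalam_news.csv")  # hypothetical: "text", "label" columns
X_train, X_val, y_train, y_val = train_test_split(
    df["text"], df["label"], test_size=0.2, random_state=42, stratify=df["label"])

# Word uni/bi-gram TF-IDF features (assumed configuration).
vectorizer = TfidfVectorizer(ngram_range=(1, 2), max_features=50_000)
X_train_vec = vectorizer.fit_transform(X_train)
X_val_vec = vectorizer.transform(X_val)

for name, clf in [("LR", LogisticRegression(max_iter=1000)),
                  ("SVM", LinearSVC())]:
    clf.fit(X_train_vec, y_train)
    preds = clf.predict(X_val_vec)
    print(name, "macro F1:", f1_score(y_val, preds, average="macro"))
```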
Misogyny memes are a form of online content that spreads harmful and damaging ideas about women. By combining images and text, they often aim to mock, disrespect, or insult women, sometimes overtly and other times in more subtle, insidious ways. Detecting misogyny memes is crucial for fostering safer and more respectful online communities. While extensive research has been conducted on high-resource languages (HRLs) like English, low-resource Dravidian languages such as Tamil and Malayalam remain largely overlooked. The shared task on Misogyny Meme Detection, organized as part of DravidianLangTech@NAACL 2025, provided a platform to tackle the challenge of identifying misogynistic content in memes, specifically in Malayalam. We participated in the competition and adopted a multimodal approach: a ResNet18 model extracts visual features from the meme image, while the IndicBERT model encodes the accompanying text. Our system achieved an F1-score of 0.87, earning 3rd rank in the task.
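A minimal sketch of the multimodal idea described above is given below: ResNet18 as the image encoder, IndicBERT as the text encoder, and their pooled features concatenated into a small classification head. The fusion strategy, pooling choice, and head size are assumptions for illustration, not the authors' exact design.

```python
# Sketch: late-fusion multimodal classifier (ResNet18 image features +
# IndicBERT text features) for misogyny meme detection.
import torch
import torch.nn as nn
from torchvision.models import resnet18
from transformers import AutoModel, AutoTokenizer


class MisogynyMemeClassifier(nn.Module):
    def __init__(self, text_model="ai4bharat/indic-bert", num_classes=2):
        super().__init__()
        # Visual branch: ResNet18 with its final classification layer removed,
        # leaving a 512-dim pooled feature per image.
        backbone = resnet18(weights="IMAGENET1K_V1")
        self.image_encoder = nn.Sequential(*list(backbone.children())[:-1])
        # Textual branch: pre-trained IndicBERT encoder.
        self.text_encoder = AutoModel.from_pretrained(text_model)
        text_dim = self.text_encoder.config.hidden_size
        # Simple concatenation fusion followed by a linear head (assumed).
        self.classifier = nn.Linear(512 + text_dim, num_classes)

    def forward(self, images, input_ids, attention_mask):
        img_feat = self.image_encoder(images).flatten(1)            # (B, 512)
        txt_out = self.text_encoder(input_ids=input_ids,
                                    attention_mask=attention_mask)
        txt_feat = txt_out.last_hidden_state[:, 0]                  # [CLS] vector
        return self.classifier(torch.cat([img_feat, txt_feat], dim=1))


# Usage example with dummy inputs (batch of 2 memes).
tokenizer = AutoTokenizer.from_pretrained("ai4bharat/indic-bert")
model = MisogynyMemeClassifier()
enc = tokenizer(["sample caption", "another caption"],
                padding=True, truncation=True, return_tensors="pt")
images = torch.randn(2, 3, 224, 224)
logits = model(images, enc["input_ids"], enc["attention_mask"])
print(logits.shape)  # torch.Size([2, 2])
```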