Billodal Roy
While large language models show promise as AI tutors, evaluating their pedagogical capabilities remains challenging. In this paper, we, team LexiLogic, present our participation in the BEA 2025 shared task on evaluating AI tutors across five dimensions: Mistake Identification, Mistake Location, Providing Guidance, Actionability, and Tutor Identification. We approach all tracks as classification tasks using fine-tuned transformer models on a dataset of 300 educational dialogues between a student and a tutor in the mathematical domain. Our results show varying performance across tracks, with macro average F1 scores ranging from 0.47 to 0.82, achieving rankings between 4th and 31st place. Such models have the potential to be used in developing automated scoring metrics for assessing the pedagogical skills of AI math tutors.
Code-switched generation is an emerging application in NLP systems, as code-switched text and speech are common and natural forms of conversation in multilingual communities worldwide. While monolingual generation has matured significantly with advances in large language models, code-switched generation remains challenging, especially for languages and domains with less representation in pre-training datasets. In this paper, we describe our submission to the shared task of predicting human preferences for code-switched text in English-Malayalam, English-Tamil, and English-Hindi. We discuss our various approaches and report the accuracy scores for each approach.
Social media platforms have become a significant medium for communication and expression, but they are also plagued by misogynistic content targeting women. This study focuses on detecting misogyny in memes and abusive textual content in Tamil and Malayalam languages, which are underrepresented in natural language processing research. Leveraging advanced machine learning and deep learning techniques, we developed a system capable of identifying misogynistic memes and abusive text. By addressing cultural and linguistic nuances, our approach enhances detection accuracy and contributes to safer online spaces for women. This work also serves as a foundation for expanding misogyny detection to other low-resource languages, fostering inclusivity and combating online abuse effectively. This paper presents our work on detecting misogynistic memes and abusive Tamil and Malayalam text targeting women on social media platforms. Leveraging the pretrained models l3cube-pune/tamil-bert and l3cube-pune/malayalam-bert, we explored various data cleaning and augmentation strategies to enhance detection performance. The models were fine-tuned on curated datasets and evaluated using accuracy, F1-score, precision, and recall. The results demonstrated significant improvements with our cleaning and augmentation techniques, yielding robust performance in detecting nuanced and culturally-specific abusive content. Our model achieved macro F1 scores of 77.83/78.24 on L3Cube-Bert-Tamil and 78.16/77.01 on L3Cube-Bert-Malayalam, ranking 3rd and 4th on the leaderboard. For the misogyny task, we obtained 83.58/82.94 on L3Cube-Bert-Malayalam and 73.16/73.8 on L3Cube-Bert-Tamil, placing 9th in both. These results highlight our model’s effectiveness in low-resource language classification.
Fake news and hard-to-detect AI-generated content are pressing issues in online media, and they are expected to worsen with recent advances in generative AI. Moreover, tools to keep such content in check are less accurate for languages with less available online data. In this paper, we describe our submissions to two shared tasks at the NAACL Dravidian Language Tech workshop, namely detecting fake news in Malayalam and detecting AI-generated product reviews in Malayalam and Tamil. We obtained test macro F1 scores of 0.29 and 0.82 in the multi-class and binary classification sub-tasks within the Malayalam fake news task, and test macro F1 scores of 0.9 and 0.646 in the task of detecting AI-generated product reviews in Malayalam and Tamil respectively.
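Several of these abstracts report macro F1, which averages per-class F1 with equal weight for every class, so a poorly predicted minority class lowers the score even when overall accuracy is high. The following is a minimal sketch of that metric in plain Python; the function name `macro_f1` and the toy labels are illustrative assumptions, not code or data from any of the papers or shared tasks.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy labels, not taken from any shared-task dataset.
y_true = ["fake", "fake", "real", "real", "real", "real"]
y_pred = ["real", "fake", "real", "real", "real", "real"]
score = macro_f1(y_true, y_pred)
```

In this toy case five of six predictions are correct, yet the single missed "fake" example pulls the minority-class F1 down to 2/3, and the macro average reflects that.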
This paper describes our participation in the DravidianLangTech@NAACL 2025 shared task on hate speech detection in Dravidian languages. While the task provided both text transcripts and audio data, we demonstrate that competitive results can be achieved using text features alone. We employed fine-tuned Bidirectional Encoder Representations from Transformers (BERT) models from l3cube-pune for Malayalam, Tamil, and Telugu. Our system achieved notable results, securing second position for the Tamil and Malayalam tasks and first position for Telugu on the official leaderboard.
We present our approach and findings for two sentiment analysis shared tasks as part of DravidianLangTech@NAACL 2025. The first task involved a seven-class political sentiment classification for Tamil tweets, while the second addressed code-mixed sentiment analysis in Tamil-English and Tulu-English social media texts. We employed language-specific BERT models fine-tuned on the respective tasks, specifically utilizing the L3Cube-Tamil-BERT for Tamil classification and a Telugu-based BERT model for Tulu classification. Our system achieved notable results, particularly securing the first position in the Tulu code-mixed sentiment analysis track. The experiments demonstrate the effectiveness of language-specific pre-trained models for Dravidian language sentiment analysis, while also highlighting the challenges in handling political discourse and code-mixed content.