Supriya Chanda


2026

Over the past decade, the rapid advancement of LLMs has significantly improved natural language generation. However, these models often inherit and amplify gender biases present in large-scale training data, leading to stereotypical associations, androcentric language, and misgendering. Such biases can negatively impact applications in education, healthcare, legal systems, and automated content generation. In this paper, we address this issue as defined in the LT-EDI shared task on Gender-Inclusive Language Generation. The task focuses on rewriting gender-biased sentences into inclusive, gender-neutral alternatives while preserving meaning. We propose a retrieval-augmented framework that combines lexical replacement, semantic retrieval, and controlled instruction-tuned generation. An edit-distance constraint and a self-evaluation step keep the outputs minimal, coherent, and bias-free. We also present zero-shot adaptation for a low-resource language. The implementation code is available at https://github.com/SupriyaChanda/gilg-ltedi-acl2026.git.
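As a rough illustration of how lexical replacement can be paired with an edit-distance constraint, the sketch below swaps gendered terms from a small lexicon and accepts the rewrite only if it stays within a token-level edit budget. The lexicon `NEUTRAL_TERMS`, the function names, and the `max_edits` threshold are all hypothetical; the abstract does not specify the actual word list, distance measure, or threshold, and the retrieval and generation stages are not modelled here.

```python
import re

# Hypothetical replacement lexicon (assumption: not taken from the paper).
NEUTRAL_TERMS = {
    "chairman": "chairperson",
    "policeman": "police officer",
    "mankind": "humankind",
}


def lexical_rewrite(sentence):
    """Replace words found in NEUTRAL_TERMS; leave all other text untouched."""
    def swap(match):
        word = match.group(0)
        neutral = NEUTRAL_TERMS.get(word.lower())
        if neutral is None:
            return word
        # Keep sentence-initial capitalisation of the replaced word.
        return neutral.capitalize() if word[0].isupper() else neutral

    return re.sub(r"[A-Za-z]+", swap, sentence)


def token_edit_distance(a, b):
    """Levenshtein distance computed over whitespace-separated tokens."""
    x, y = a.split(), b.split()
    prev = list(range(len(y) + 1))
    for i, x_tok in enumerate(x, 1):
        cur = [i]
        for j, y_tok in enumerate(y, 1):
            cur.append(min(
                prev[j] + 1,                     # delete x_tok
                cur[j - 1] + 1,                  # insert y_tok
                prev[j - 1] + (x_tok != y_tok),  # substitute or match
            ))
        prev = cur
    return prev[-1]


def constrained_rewrite(sentence, max_edits=3):
    """Return the rewrite if it stays within max_edits tokens, else the input."""
    candidate = lexical_rewrite(sentence)
    if token_edit_distance(sentence, candidate) <= max_edits:
        return candidate
    return sentence


print(constrained_rewrite("The chairman thanked the policeman."))
```

Falling back to the original sentence when the budget is exceeded is only one possible policy; a fuller system could instead pass the sentence on to a generation step.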

2020

This paper reports our submission to Shared Task 2 at W-NUT 2020: Identification of informative COVID-19 English tweets. We tried several techniques and briefly describe the two models that showed promising results on tweet classification: DistilBERT and FastText. DistilBERT achieves an F1 score of 0.7508 on the test set, the best of our submissions.
This paper describes the IRlab@IIT-BHU system for OffensEval 2020. We use an SVM with TF-IDF features to identify and categorize hate speech and offensive language in social media for two languages. In subtask A, we used a linear SVM classifier to detect abusive content in tweets, achieving macro F1 scores of 0.779 and 0.718 for Arabic and Greek, respectively.
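For readers unfamiliar with the metric, macro F1 is the unweighted mean of the per-class F1 scores, so each class counts equally regardless of its frequency. The sketch below computes it from scratch; the function name, the `OFF`/`NOT` labels, and the toy predictions are illustrative assumptions and are not data from the paper.

```python
def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over all labels seen in gold or pred."""
    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        pairs = list(zip(gold, pred))
        tp = sum(g == label and p == label for g, p in pairs)
        fp = sum(g != label and p == label for g, p in pairs)
        fn = sum(g == label and p != label for g, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            scores.append(2 * precision * recall / (precision + recall))
        else:
            scores.append(0.0)
    return sum(scores) / len(scores)


# Toy example with made-up labels (not results from the paper).
gold = ["OFF", "OFF", "NOT", "NOT"]
pred = ["OFF", "NOT", "NOT", "NOT"]
print(round(macro_f1(gold, pred), 4))
```

In this toy case the per-class F1 scores are 2/3 for `OFF` and 4/5 for `NOT`, so the macro average is 11/15.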