Sagnik Basu


2025

Navigating the Cultural Kaleidoscope: A Hitchhiker’s Guide to Sensitivity in Large Language Models
Somnath Banerjee | Sayan Layek | Hari Shrawgi | Rajarshi Mandal | Avik Halder | Shanu Kumar | Sagnik Basu | Parag Agrawal | Rima Hazra | Animesh Mukherjee
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)

Cultural harm arises in LLMs when these models fail to align with specific cultural norms, resulting in misrepresentations or violations of cultural values. This work addresses the challenges of ensuring cultural sensitivity in LLMs, especially in small-parameter models that often lack the extensive training data needed to capture global cultural nuances. We present two key contributions: (1) a cultural harm test dataset, created to assess model outputs across different cultural contexts through scenarios that expose potential cultural insensitivities, and (2) a culturally aligned preference dataset, aimed at restoring cultural sensitivity through fine-tuning based on feedback from diverse annotators. These datasets facilitate the evaluation and enhancement of LLMs, supporting their ethical and safe deployment across different cultural landscapes. Our results show that integrating culturally aligned feedback leads to a marked improvement in model behavior, significantly reducing the likelihood of generating culturally insensitive or harmful content.

2021

How vulnerable are you? A Novel Computational Psycholinguistic Analysis for Phishing Influence Detection
Anik Chatterjee | Sagnik Basu
Proceedings of the 18th International Conference on Natural Language Processing (ICON)

This paper presents our work on phishing detection through the identification of influential sentences. As more transactions and offers move online, people have become increasingly vulnerable to security breaches through phishing attacks and to persuasion through influential text on social media sites. We analyzed influential and non-influential sentences and populated our dataset with them. We propose a computational model that implements Cialdini's principles of persuasion and achieves state-of-the-art accuracy. Our approach is language-independent and domain-independent, and it is applicable to any problem where persuasion detection is important. Our dataset and proposed computational psycholinguistic approach will motivate researchers to work more in the area of persuasion detection.