Pranav Narayanan Venkit

Also published as: Pranav Narayanan Venkit, Pranav Venkit


2023

pdf
Nationality Bias in Text Generation
Pranav Narayanan Venkit | Sanjana Gautam | Ruchi Panchanadikar | Ting-Hao Huang | Shomir Wilson
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics

Little attention is placed on analyzing nationality bias in language models, especially when nationality is highly used as a factor in increasing the performance of social NLP models. This paper examines how a text generation model, GPT-2, accentuates pre-existing societal biases about country-based demonyms. We generate stories using GPT-2 for various nationalities and use sensitivity analysis to explore how the number of internet users and the country’s economic status impacts the sentiment of the stories. To reduce the propagation of biases through large language models (LLM), we explore the debiasing method of adversarial triggering. Our results show that GPT-2 demonstrates significant bias against countries with lower internet users, and adversarial triggering effectively reduces the same.

pdf
The Sentiment Problem: A Critical Survey towards Deconstructing Sentiment Analysis
Pranav Venkit | Mukund Srinath | Sanjana Gautam | Saranya Venkatraman | Vipul Gupta | Rebecca Passonneau | Shomir Wilson
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

We conduct an inquiry into the sociotechnical aspects of sentiment analysis (SA) by critically examining 189 peer-reviewed papers on their applications, models, and datasets. Our investigation stems from the recognition that SA has become an integral component of diverse sociotechnical systems, exerting influence on both social and technical users. By delving into sociological and technological literature on sentiment, we unveil distinct conceptualizations of this term in domains such as finance, government, and medicine. Our study exposes a lack of explicit definitions and frameworks for characterizing sentiment, resulting in potential challenges and biases. To tackle this issue, we propose an ethics sheet encompassing critical inquiries to guide practitioners in ensuring equitable utilization of SA. Our findings underscore the significance of adopting an interdisciplinary approach to defining sentiment in SA and offer a pragmatic solution for its implementation.

pdf
Automated Ableism: An Exploration of Explicit Disability Biases in Sentiment and Toxicity Analysis Models
Pranav Narayanan Venkit | Mukund Srinath | Shomir Wilson
Proceedings of the 3rd Workshop on Trustworthy Natural Language Processing (TrustNLP 2023)

We analyze sentiment analysis and toxicity detection models to detect the presence of explicit bias against people with disability (PWD). We employ the bias identification framework of Perturbation Sensitivity Analysis to examine conversations related to PWD on social media platforms, specifically Twitter and Reddit, in order to gain insight into how disability bias is disseminated in real-world social settings. We then create the Bias Identification Test in Sentiment (BITS) corpus to quantify explicit disability bias in any sentiment analysis and toxicity detection models. Our study utilizes BITS to uncover significant biases in four open AIaaS (AI as a Service) sentiment analysis tools, namely TextBlob, VADER, Google Cloud Natural Language API, DistilBERT and two toxicity detection models, namely two versions of Toxic-BERT. Our findings indicate that all of these models exhibit statistically significant explicit bias against PWD.

2022

pdf
A Study of Implicit Bias in Pretrained Language Models against People with Disabilities
Pranav Narayanan Venkit | Mukund Srinath | Shomir Wilson
Proceedings of the 29th International Conference on Computational Linguistics

Pretrained language models (PLMs) have been shown to exhibit sociodemographic biases, such as against gender and race, raising concerns of downstream biases in language technologies. However, PLMs’ biases against people with disabilities (PWDs) have received little attention, in spite of their potential to cause similar harms. Using perturbation sensitivity analysis, we test an assortment of popular word embedding-based and transformer-based PLMs and show significant biases against PWDs in all of them. The results demonstrate how models trained on large corpora widely favor ableist language.