Sarah Jane Delany


2023

pdf
Measuring Gender Bias in Natural Language Processing: Incorporating Gender-Neutral Linguistic Forms for Non-Binary Gender Identities in Abusive Speech Detection
Nasim Sobhani | Kinshuk Sengupta | Sarah Jane Delany
Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing

Predictions from machine learning models can reflect bias in the data on which they are trained. Gender bias has been shown to be prevalent in natural language processing models. The research into identifying and mitigating gender bias in these models predominantly considers gender as binary, male and female, neglecting the fluidity and continuity of gender as a variable. In this paper, we present an approach to evaluate gender bias in a prediction task, which recognises the non-binary nature of gender. We gender-neutralise a random subset of existing real-world hate speech data. We extend the existing template approach for measuring gender bias to include test examples that are gender-neutral. Measuring the bias across a selection of hate speech datasets we show that the bias for the gender-neutral data is closer to that seen for test instances that identify as male than those that identify as female.

2021

pdf
Interactive Learning Approach for Arabic Target-Based Sentiment Analysis
Husamelddin Balla | Marisa Llorens Salvador | Sarah Jane Delany
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)

Recently, the majority of sentiment analysis researchers focus on target-based sentiment analysis because it delivers in-depth analysis with more accurate results as compared to traditional sentiment analysis. In this paper, we propose an interactive learning approach to tackle a target-based sentiment analysis task for the Arabic language. The proposed IA-LSTM model uses an interactive attention-based mechanism to force the model to focus on different parts (targets) of a sentence. We investigate the ability to use targets, right, and left context, and model them separately to learn their own representations via interactive modeling. We evaluated our model on two different datasets: Arabic hotel review and Arabic book review datasets. The results demonstrate the effectiveness of using this interactive modeling technique for the Arabic target-based task. The model obtained accuracy values of 83.10 compared to SOTA models such as AB-LSTM-PC which obtained 82.60 for the same dataset.