Ashima Suvarna


2020

#NotAWhore! A Computational Linguistic Perspective of Rape Culture and Victimization on Social Media
Ashima Suvarna | Grusha Bhalla
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop

The recent surge in online forums and movements supporting sexual assault survivors has led to the emergence of a ‘virtual bubble’ where survivors can recount their stories. However, this also makes the survivors vulnerable to bullying, trolling, and victim blaming. Specifically, victim blaming has been shown to have acute psychological effects on survivors and to further discourage formal reporting of such crimes. Therefore, it is important to devise computational methods to identify and prevent victim blaming in order to protect the victims. In our work, we discuss the drastic effects of victim blaming through a short case study and then propose a single-step, transfer-learning-based classification method to identify victim-blaming language on Twitter. Finally, we compare the performance of our proposed model against various deep learning and machine learning models on a manually annotated, domain-specific dataset.

Evaluating the Impact of Sub-word Information and Cross-lingual Word Embeddings on Mi’kmaq Language Modelling
Jeremie Boudreau | Akankshya Patra | Ashima Suvarna | Paul Cook
Proceedings of the Twelfth Language Resources and Evaluation Conference

Mi’kmaq is an Indigenous language spoken primarily in Eastern Canada. It is polysynthetic and low-resource. In this paper, we consider a range of n-gram and RNN language models for Mi’kmaq. We find that an RNN language model initialized with pre-trained fastText embeddings performs best, highlighting the importance of sub-word information for Mi’kmaq language modelling. We further consider approaches to language modelling that incorporate cross-lingual word embeddings, but do not see improvements with these models. Finally, we consider language models that operate over segmentations produced by SentencePiece (which include sub-word units as tokens), as opposed to word-level models. We see improvements for this approach over word-level language models, again indicating that sub-word modelling is important for Mi’kmaq language modelling.