Michele Donini



2023

Geographical Erasure in Language Generation
Pola Schwöbel | Jacek Golebiowski | Michele Donini | Cedric Archambeau | Danish Pruthi
Findings of the Association for Computational Linguistics: EMNLP 2023

Large language models (LLMs) encode vast amounts of world knowledge. However, since these models are trained on large swaths of internet data, they are at risk of inordinately capturing information about dominant groups. This imbalance can propagate into generated language. In this work, we study and operationalise a form of geographical erasure wherein language models underpredict certain countries. We demonstrate consistent instances of erasure across a range of LLMs. We discover that erasure strongly correlates with low frequencies of country mentions in the training corpus. Lastly, we mitigate erasure by finetuning using a custom objective.

2021

On the Lack of Robust Interpretability of Neural Text Classifiers
Muhammad Bilal Zafar | Michele Donini | Dylan Slack | Cedric Archambeau | Sanjiv Das | Krishnaram Kenthapadi
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021