Miguel Couceiro


2024

The Balancing Act: Unmasking and Alleviating ASR Biases in Portuguese
Ajinkya Kulkarni | Anna Tokareva | Rameez Qureshi | Miguel Couceiro
Proceedings of the Fourth Workshop on Language Technology for Equality, Diversity, Inclusion

In the field of spoken language understanding, systems like Whisper and Massively Multilingual Speech (MMS) have shown state-of-the-art performance. This study explores the Whisper and MMS systems, with a focus on assessing biases in automatic speech recognition (ASR) on casual conversational speech in Portuguese. Our investigation covers several categories, including gender, age, skin tone, and geo-location. Alongside traditional ASR evaluation metrics such as Word Error Rate (WER), we report statistical significance (p-values) for the gender bias analysis. Furthermore, we examine the impact of data distribution and show empirically that oversampling techniques alleviate such stereotypical biases. This research is an early effort to quantify biases in the Portuguese language context using MMS and Whisper, contributing to a better understanding of ASR systems’ performance in multilingual settings.
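As an illustration of the evaluation described in this abstract, the following is a minimal sketch of per-group WER and of random oversampling, not the authors' implementation. The function names, the `(group, reference, hypothesis)` tuple layout, and the example sentences are assumptions made for this sketch; the abstract does not say how the oversampling was carried out.

```python
import random
from collections import defaultdict


def word_errors(reference: str, hypothesis: str) -> tuple[int, int]:
    """Return (word-level edit distance, number of reference words)."""
    ref = reference.split()
    hyp = hypothesis.split()
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1], len(ref)


def wer_by_group(samples):
    """Compute corpus-level WER per group.

    `samples` is an iterable of (group, reference, hypothesis) tuples
    (a hypothetical layout chosen for this sketch).
    """
    errors = defaultdict(int)
    words = defaultdict(int)
    for group, reference, hypothesis in samples:
        e, n = word_errors(reference, hypothesis)
        errors[group] += e
        words[group] += n
    return {g: errors[g] / words[g] for g in errors if words[g] > 0}


def oversample(samples, seed=0):
    """Randomly duplicate samples so every group matches the largest one."""
    rng = random.Random(seed)
    by_group = defaultdict(list)
    for sample in samples:
        by_group[sample[0]].append(sample)
    target = max(len(items) for items in by_group.values())
    balanced = []
    for items in by_group.values():
        balanced.extend(items)
        # Draw the missing samples with replacement from the same group.
        balanced.extend(rng.choices(items, k=target - len(items)))
    return balanced
```

Comparing the values returned by `wer_by_group` across groups gives a first view of disparities; a significance test on per-utterance error rates would be a separate step not shown here.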

2021

A New Broad NLP Training from Speech to Knowledge
Maxime Amblard | Miguel Couceiro
Proceedings of the Fifth Workshop on Teaching NLP

In 2018, the MSc in NLP opened at IDMC (Institut des Sciences du Digital, du Management et de la Cognition), Université de Lorraine, Nancy, France. Far from being a creation ex nihilo, it is the product of a history and of many reflections on the field and its teaching. This article proposes epistemological and critical elements on the opening and maintenance of this still-new master’s program in NLP.

GECko+: a Grammatical and Discourse Error Correction Tool
Eduardo Calò | Léo Jacqmin | Thibo Rosemplatt | Maxime Amblard | Miguel Couceiro | Ajinkya Kulkarni
Actes de la 28e Conférence sur le Traitement Automatique des Langues Naturelles. Volume 3 : Démonstrations

We introduce GECko+, a web-based writing assistance tool for English that corrects errors at both the sentence and the discourse level. It is based on two state-of-the-art models, one for grammatical error correction and one for sentence ordering. GECko+ is available online as a web application that implements a pipeline combining the two models.