Anisia Popescu
2026
The Added Value of Metadata and Annotations: Evidence from Two Large-Scale, Naturalistic Corpus Studies
Anisia Popescu | Johanna Cronenberg | Ioana Vasilescu | Ioana Chitoran | Lori Lamel | Martine Adda-Decker
Proceedings of the Fifteenth Language Resources and Evaluation Conference
This paper presents two case studies that highlight both the challenges and benefits of working with large-scale, naturalistic phonetic data. Our aim is to encourage researchers not to shy away from phonetic data found “in the wild”, even when such data are messy, noisy, or incomplete – because they can yield robust, novel insights beyond the reach of controlled laboratory studies. We focus on challenges that are endemic to large corpora, including degraded audio quality, sparse or inconsistent annotations, and missing speaker metadata. By comparing two corpus-based studies that diverge in methodology and statistical design, we show how different approaches can mitigate these limitations while still extracting meaningful patterns.
2024
Using Speech Technology to Test Theories of Phonetic and Phonological Typology
Anisia Popescu | Lori Lamel | Ioana Vasilescu
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
The present paper uses tools and methodologies derived from speech technology to test theories about phonetic typology. We specifically look at how the two-way laryngeal contrast (voiced /b, d, g, v, z/ vs. voiceless /p, t, k, f, s/ obstruents) is implemented in European Portuguese, a language that has been suggested to exhibit a voicing system different from that of its sister Romance languages and more similar to the one found in Germanic languages. A large European Portuguese corpus was force-aligned using (1) different combinations of parallel Portuguese (original), Italian (Romance language) and German (Germanic language) acoustic phone models, with an ASR system choosing the best-fitting one, and (2) pronunciation variants (/b, d, g, v, z/ produced as either [b, d, g, v, z] or [p, t, k, f, s]) for obstruent consonants. Results support previous accounts in the literature that European Portuguese is diverging from the traditional voicing system known for Romance languages, towards a hybrid system where stops and fricatives are specified for different voicing features.
2023
Typological classification of European Portuguese fricatives: a cross-language forced alignment and pronunciation variants study
Anisia Popescu | Lori Lamel | Ioana Vasilescu
Proceedings of the 6th International Conference on Natural Language and Speech Processing (ICNLSP 2023)
2016
Allophonie et position dans la syllabe: Indices acoustiques pour les consonnes laterales (Acoustics of syllable position allophony: The case of lateral consonants)
Anisia Popescu | Ioana Chitoran
Actes de la conférence conjointe JEP-TALN-RECITAL 2016. volume 1 : JEP
This article deals with the acoustic manifestation of lateral consonants in American English and Romanian as a function of syllable position and phonotactic complexity. We considered four types of measures: formant values, locus equations, intensity ratio, and the presence/absence of releases. Our goal is, on the one hand, to classify the allophones of the two languages and, on the other hand, to determine the acoustic cues to the articulatory gestures of lateral consonants. The results indicate important differences between the two languages. We show that the distribution of allophones is not binary but gradient, and that the status of the dorsal gesture can be considered a marker of "degree of clarity". We also show that allophony depends on syllable position but not necessarily on syllable complexity.