Tim Cawley


2025

English-based acoustic models perform well in the forced alignment of two English-based Pacific Creoles
Sam Passmore | Lila San Roque | Kirsty Gillespie | Saurabh Kumar Nath | Kira Davey | Keira Mullan | Tim Cawley | Jennifer Biggs | Rosey Billington | Bethwyn Evans | Nick Thieberger | Nicholas Evans | Danielle Barth
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Expanding the breadth of languages used to study sociophonetic variation and change is an important step in the theoretical development of sociophonetics. As data archives grow, forced alignment can accelerate the study of sociophonetic variation in minority languages. This paper examines the application of English and custom-made acoustic models to the alignment of vowels in two Pacific Creoles, Tok Pisin (59 hours) and Bislama (38.5 hours). We find that English models perform acceptably well in both languages, and as well as humans in vowel environments described as ‘Highly Reliable’. Custom models performed better in Bislama than in Tok Pisin. We end the paper with recommendations on the use of cross-linguistic acoustic models in the case of English-based Creoles.

Understanding Multilingual ASR Systems: The Role of Language Families and Typological Features in Seamless and Whisper
Simon Gonzalez | Tao Hoang | Maria Myung-Hee Kim | Bradley Donnelly | Jennifer Biggs | Tim Cawley
Proceedings of The 23rd Annual Workshop of the Australasian Language Technology Association

This study investigates the extent to which linguistic typology influences the performance of two automatic speech recognition (ASR) systems across diverse language families. Using the FLEURS corpus and typological features from the World Atlas of Language Structures (WALS), we analysed 40 languages grouped by phonological, morphological, syntactic, and semantic domains. We evaluated two state-of-the-art multilingual ASR systems, Whisper and Seamless, to examine how their performance, measured by word error rate (WER), correlates with linguistic structures. Random Forests and Mixed Effects Models were used to quantify feature impact and statistical significance. Results reveal that while both systems leverage typological patterns, they differ in their sensitivity to specific domains. Our findings highlight how structural and functional linguistic features shape ASR performance, offering insights into model generalisability and typology-aware system development.