Multilingual MFA: Forced Alignment on Low-Resource Related Languages

Alessio Tosolini, Claire Bowern


Abstract
We compare the outcomes of multilingual and crosslingual training for related and unrelated Australian languages with similar phonological inventories. We use the Montreal Forced Aligner to train acoustic models from scratch and adapt a large English model, evaluating results against seen data, unseen data (seen language), and unseen data and language. Results indicate benefits of adapting the English baseline model for previously unseen languages.
Anthology ID:
2025.computel-main.11
Volume:
Proceedings of the Eighth Workshop on the Use of Computational Methods in the Study of Endangered Languages
Month:
March
Year:
2025
Address:
Honolulu, Hawaii, USA
Editors:
Jordan Lachler, Godfred Agyapong, Antti Arppe, Sarah Moeller, Aditi Chaudhary, Shruti Rijhwani, Daisy Rosenblum
Venues:
ComputEL | WS
Publisher:
Association for Computational Linguistics
Pages:
100–109
URL:
https://preview.aclanthology.org/Ingest-2025-COMPUTEL/2025.computel-main.11/
Cite (ACL):
Alessio Tosolini and Claire Bowern. 2025. Multilingual MFA: Forced Alignment on Low-Resource Related Languages. In Proceedings of the Eighth Workshop on the Use of Computational Methods in the Study of Endangered Languages, pages 100–109, Honolulu, Hawaii, USA. Association for Computational Linguistics.
Cite (Informal):
Multilingual MFA: Forced Alignment on Low-Resource Related Languages (Tosolini & Bowern, ComputEL 2025)
PDF:
https://preview.aclanthology.org/Ingest-2025-COMPUTEL/2025.computel-main.11.pdf