Abstract
Automatic speech recognition (ASR) technology is frequently proposed as a means of preservation and documentation of endangered languages, with promising results thus far. Among the endangered languages spoken today, a significant number exhibit complex morphology. The models employed in contemporary language documentation pipelines that utilize ASR, however, are predominantly based on isolating or inflectional languages, often from the Indo-European family. This raises a critical concern: building models exclusively on such languages may introduce a bias, resulting in better performance with simpler morphological structures. In this paper, we investigate the performance of modern ASR architectures on morphologically complex languages. Results indicate that, in terms of word error rate, modern ASR architectures appear less robust in managing high out-of-vocabulary (OOV) rates for morphologically complex languages, while character error rates are consistently higher for isolating languages.
- Anthology ID: 2024.findings-emnlp.166
- Volume: Findings of the Association for Computational Linguistics: EMNLP 2024
- Month: November
- Year: 2024
- Address: Miami, Florida, USA
- Editors: Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
- Venue: Findings
- Publisher: Association for Computational Linguistics
- Pages: 2953–2963
- URL: https://preview.aclanthology.org/add_missing_videos/2024.findings-emnlp.166/
- DOI: 10.18653/v1/2024.findings-emnlp.166
- Cite (ACL): Eric Le Ferrand, Zoey Liu, Antti Arppe, and Emily Prud’hommeaux. 2024. Are modern neural ASR architectures robust for polysynthetic languages?. In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 2953–2963, Miami, Florida, USA. Association for Computational Linguistics.
- Cite (Informal): Are modern neural ASR architectures robust for polysynthetic languages? (Le Ferrand et al., Findings 2024)
- PDF: https://preview.aclanthology.org/add_missing_videos/2024.findings-emnlp.166.pdf