Abstract
Existing popular text-to-speech technologies focus on large models requiring a large corpus of recorded speech to train. The resulting models are typically run on high-resource servers where users synthesise speech from a client device requiring constant connectivity. For speakers of low-resource languages living in remote areas, this approach does not work. Corpora are typically small and synthesis needs to run on an unconnected, battery or solar-powered edge device. In this paper, we demonstrate how knowledge transfer and adversarial training can be used to create efficient models capable of running on edge devices using a corpus of only several hours. We apply these concepts to create a voice synthesiser for te reo Māori (the indigenous language of Aotearoa New Zealand) for a non-speaking user and ‘ōlelo Hawaiʻi (the indigenous language of Hawaiʻi) for a legally blind user, thus creating the first high-quality text-to-speech tools for these endangered, central-eastern Polynesian languages capable of running on a low powered edge device.- Anthology ID:
- 2024.sigul-1.50
- Volume:
- Proceedings of the 3rd Annual Meeting of the Special Interest Group on Under-resourced Languages @ LREC-COLING 2024
- Month:
- May
- Year:
- 2024
- Address:
- Torino, Italia
- Editors:
- Maite Melero, Sakriani Sakti, Claudia Soria
- Venues:
- SIGUL | WS
- SIG:
- Publisher:
- ELRA and ICCL
- Note:
- Pages:
- 421–426
- Language:
- URL:
- https://aclanthology.org/2024.sigul-1.50
- DOI:
- Cite (ACL):
- Tūreiti Keith. 2024. Work in Progress: Text-to-speech on Edge Devices for Te Reo Māori and ‘Ōlelo Hawaiʻi. In Proceedings of the 3rd Annual Meeting of the Special Interest Group on Under-resourced Languages @ LREC-COLING 2024, pages 421–426, Torino, Italia. ELRA and ICCL.
- Cite (Informal):
- Work in Progress: Text-to-speech on Edge Devices for Te Reo Māori and ‘Ōlelo Hawaiʻi (Keith, SIGUL-WS 2024)
- PDF:
- https://preview.aclanthology.org/landing_page/2024.sigul-1.50.pdf