Daniel Menendez
2025
Text-to-speech system for low-resource languages: A case study in Shipibo-Konibo (a Panoan language from Peru)
Daniel Menendez | Hector Gomez
Proceedings of the Fifth Workshop on NLP for Indigenous Languages of the Americas (AmericasNLP)
This paper presents the design and development of a Text-to-Speech (TTS) model for Shipibo-Konibo, a low-resource indigenous language spoken mainly in the Peruvian Amazon. Despite the challenge posed by data scarcity, the model was trained on over 4 hours of recordings and 3,025 meticulously collected written sentences. The test results demonstrated an intelligibility rate (IR) exceeding 88% and a mean opinion score (MOS) of 4.01, confirming the quality of the audio generated by the model, which comprises the Tacotron 2 spectrogram predictor and the HiFi-GAN vocoder. Furthermore, the paper highlights the potential of this model to be trained on other indigenous languages spoken in Peru, opening a promising avenue for the documentation and revitalization of these languages.
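As a rough illustration of the two reported metrics, the sketch below computes a mean opinion score and an intelligibility rate from listener data. It is not the authors' evaluation code. The function names, the 1 to 5 rating scale, and the definition of IR as the percentage of test items listeners understood correctly are assumptions made for illustration; the abstract does not state how IR was measured. All input values are invented.

```python
def mean_opinion_score(ratings):
    """Average of listener ratings, assuming a 1-5 scale (assumption)."""
    if not ratings:
        raise ValueError("ratings must not be empty")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("each rating must lie between 1 and 5")
    return sum(ratings) / len(ratings)


def intelligibility_rate(correct_items, total_items):
    """Percentage of test items understood correctly.

    Hypothetical definition: the abstract does not specify whether
    IR is counted per word or per sentence.
    """
    if total_items <= 0:
        raise ValueError("total_items must be positive")
    if not 0 <= correct_items <= total_items:
        raise ValueError("correct_items must be between 0 and total_items")
    return 100 * correct_items / total_items


# Invented example data, not results from the paper.
example_ratings = [4, 4, 5, 3, 4]
print(mean_opinion_score(example_ratings))
print(intelligibility_rate(22, 25))
```

In practice a MOS is usually reported alongside a confidence interval over listeners and utterances, which this sketch leaves out for brevity.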