Crowdsourcing Latin American Spanish for Low-Resource Text-to-Speech
Adriana Guevara-Rukoz, Isin Demirsahin, Fei He, Shan-Hui Cathy Chu, Supheakmungkol Sarin, Knot Pipatsrisawat, Alexander Gutkin, Alena Butryna, Oddur Kjartansson
Abstract
In this paper we present a multidialectal corpus approach for building a text-to-speech voice for a new dialect in a language with existing resources, focusing on various South American dialects of Spanish. We first present public speech datasets for Argentinian, Chilean, Colombian, Peruvian, Puerto Rican and Venezuelan Spanish specifically constructed with text-to-speech applications in mind using crowd-sourcing. We then compare the monodialectal voices built with minimal data to a multidialectal model built by pooling all the resources from all dialects. Our results show that the multidialectal model outperforms the monodialectal baseline models. We also experiment with a “zero-resource” dialect scenario where we build a multidialectal voice for a dialect while holding out target dialect recordings from the training data.
- Anthology ID:
- 2020.lrec-1.801
- Volume:
- Proceedings of the Twelfth Language Resources and Evaluation Conference
- Month:
- May
- Year:
- 2020
- Address:
- Marseille, France
- Editors:
- Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
- Venue:
- LREC
- Publisher:
- European Language Resources Association
- Pages:
- 6504–6513
- Language:
- English
- URL:
- https://preview.aclanthology.org/jlcl-multiple-ingestion/2020.lrec-1.801/
- Cite (ACL):
- Adriana Guevara-Rukoz, Isin Demirsahin, Fei He, Shan-Hui Cathy Chu, Supheakmungkol Sarin, Knot Pipatsrisawat, Alexander Gutkin, Alena Butryna, and Oddur Kjartansson. 2020. Crowdsourcing Latin American Spanish for Low-Resource Text-to-Speech. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 6504–6513, Marseille, France. European Language Resources Association.
- Cite (Informal):
- Crowdsourcing Latin American Spanish for Low-Resource Text-to-Speech (Guevara-Rukoz et al., LREC 2020)
- PDF:
- https://preview.aclanthology.org/jlcl-multiple-ingestion/2020.lrec-1.801.pdf