Abstract
Grapheme-to-phoneme conversion (g2p) is necessary for text-to-speech and automatic speech recognition systems. Most g2p systems are monolingual: they require language-specific data or handcrafted rules. Such systems are difficult to extend to low-resource languages, for which data and handcrafted rules are not available. As an alternative, we present a neural sequence-to-sequence approach to g2p that is trained on spelling–pronunciation pairs in hundreds of languages. The system shares a single encoder and decoder across all languages, allowing it to exploit the intrinsic similarities between different writing systems. We show an 11% improvement in phoneme error rate over an approach based on adapting high-resource monolingual g2p models to low-resource languages. Our model is also much more compact than previous approaches.

- Anthology ID: W17-5403
- Volume: Proceedings of the First Workshop on Building Linguistically Generalizable NLP Systems
- Month: September
- Year: 2017
- Address: Copenhagen, Denmark
- Venue: WS
- Publisher: Association for Computational Linguistics
- Pages: 19–26
- URL: https://aclanthology.org/W17-5403
- DOI: 10.18653/v1/W17-5403
- Cite (ACL): Ben Peters, Jon Dehdari, and Josef van Genabith. 2017. Massively Multilingual Neural Grapheme-to-Phoneme Conversion. In Proceedings of the First Workshop on Building Linguistically Generalizable NLP Systems, pages 19–26, Copenhagen, Denmark. Association for Computational Linguistics.
- Cite (Informal): Massively Multilingual Neural Grapheme-to-Phoneme Conversion (Peters et al., 2017)
- PDF: https://preview.aclanthology.org/remove-xml-comments/W17-5403.pdf
- Code: bpopeters/mg2p
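
The abstract describes a single encoder and decoder shared across hundreds of languages. A common way to realize this in a sequence-to-sequence setup is to prepend a language-ID token to the grapheme sequence, so the shared model can condition on the target language. The sketch below illustrates only that data format; the function name and token conventions are illustrative assumptions, not the paper's exact implementation.

```python
def make_example(lang, word, phonemes):
    """Build a (source, target) token pair for a multilingual g2p
    seq2seq model. `lang` is an illustrative language code; the
    `<lang>` tag convention is an assumption, not the paper's spec.
    """
    # Source: language-ID token followed by the word's graphemes.
    src = [f"<{lang}>"] + list(word)
    # Target: space-separated phoneme symbols (e.g. IPA).
    tgt = phonemes.split()
    return src, tgt

# Example with a hypothetical German entry:
src, tgt = make_example("deu", "schön", "ʃ øː n")
# src == ['<deu>', 's', 'c', 'h', 'ö', 'n'], tgt == ['ʃ', 'øː', 'n']
```

Because every language's examples flow through the same encoder and decoder, related orthographies (say, Latin-script languages) can share learned grapheme representations, which is what lets the model transfer to low-resource languages.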