String Transduction with Target Language Models and Insertion Handling
Abstract
Many character-level tasks can be framed as sequence-to-sequence transduction, where the target is a word from a natural language. We show that leveraging target language models derived from unannotated target corpora, combined with a precise alignment of the training data, yields state-of-the-art results on cognate projection, inflection generation, and phoneme-to-grapheme conversion.
- Anthology ID:
- W18-5805
- Volume:
- Proceedings of the Fifteenth Workshop on Computational Research in Phonetics, Phonology, and Morphology
- Month:
- October
- Year:
- 2018
- Address:
- Brussels, Belgium
- Editors:
- Sandra Kuebler, Garrett Nicolai
- Venue:
- EMNLP
- SIG:
- SIGMORPHON
- Publisher:
- Association for Computational Linguistics
- Pages:
- 43–53
- URL:
- https://aclanthology.org/W18-5805
- DOI:
- 10.18653/v1/W18-5805
- Cite (ACL):
- Garrett Nicolai, Saeed Najafi, and Grzegorz Kondrak. 2018. String Transduction with Target Language Models and Insertion Handling. In Proceedings of the Fifteenth Workshop on Computational Research in Phonetics, Phonology, and Morphology, pages 43–53, Brussels, Belgium. Association for Computational Linguistics.
- Cite (Informal):
- String Transduction with Target Language Models and Insertion Handling (Nicolai et al., EMNLP 2018)
- PDF:
- https://preview.aclanthology.org/teach-a-man-to-fish/W18-5805.pdf