Producing Unseen Morphological Variants in Statistical Machine Translation
Matthias Huck, Aleš Tamchyna, Ondřej Bojar, Alexander Fraser
Abstract
Translating into morphologically rich languages is difficult. Although the coverage of lemmas may be reasonable, many morphological variants cannot be learned from the training data. We present a statistical translation system that is able to produce these inflected word forms. Different from most previous work, we do not separate morphological prediction from lexical choice into two consecutive steps. Our approach is novel in that it is integrated in decoding and takes advantage of context information from both the source language and the target language sides.
- Anthology ID:
- E17-2059
- Volume:
- Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers
- Month:
- April
- Year:
- 2017
- Address:
- Valencia, Spain
- Editors:
- Mirella Lapata, Phil Blunsom, Alexander Koller
- Venue:
- EACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 369–375
- URL:
- https://aclanthology.org/E17-2059
- Cite (ACL):
- Matthias Huck, Aleš Tamchyna, Ondřej Bojar, and Alexander Fraser. 2017. Producing Unseen Morphological Variants in Statistical Machine Translation. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers, pages 369–375, Valencia, Spain. Association for Computational Linguistics.
- Cite (Informal):
- Producing Unseen Morphological Variants in Statistical Machine Translation (Huck et al., EACL 2017)
- PDF:
- https://preview.aclanthology.org/proper-vol2-ingestion/E17-2059.pdf