Abstract
This paper studies strategies to model word formation in NMT using rich linguistic information, namely a word segmentation approach that goes beyond splitting into substrings by considering fusional morphology. Our linguistically sound segmentation is combined with a method for target-side inflection to accommodate modeling word formation. The best system variants employ source-side morphological analysis and model complex target-side words, improving over a standard system.
- Anthology ID:
- 2020.acl-main.389
- Volume:
- Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
- Month:
- July
- Year:
- 2020
- Address:
- Online
- Editors:
- Dan Jurafsky, Joyce Chai, Natalie Schluter, Joel Tetreault
- Venue:
- ACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 4227–4232
- URL:
- https://aclanthology.org/2020.acl-main.389
- DOI:
- 10.18653/v1/2020.acl-main.389
- Cite (ACL):
- Marion Weller-Di Marco and Alexander Fraser. 2020. Modeling Word Formation in English–German Neural Machine Translation. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 4227–4232, Online. Association for Computational Linguistics.
- Cite (Informal):
- Modeling Word Formation in English–German Neural Machine Translation (Weller-Di Marco & Fraser, ACL 2020)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-2/2020.acl-main.389.pdf