Modeling Word Formation in English–German Neural Machine Translation

Marion Weller-Di Marco, Alexander Fraser


Abstract
This paper studies strategies to model word formation in NMT using rich linguistic information, namely a word segmentation approach that goes beyond splitting into substrings by considering fusional morphology. Our linguistically sound segmentation is combined with a method for target-side inflection to accommodate modeling word formation. The best system variants employ source-side morphological analysis and model complex target-side words, improving over a standard system.
Anthology ID:
2020.acl-main.389
Volume:
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2020
Address:
Online
Editors:
Dan Jurafsky, Joyce Chai, Natalie Schluter, Joel Tetreault
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
4227–4232
URL:
https://aclanthology.org/2020.acl-main.389
DOI:
10.18653/v1/2020.acl-main.389
Cite (ACL):
Marion Weller-Di Marco and Alexander Fraser. 2020. Modeling Word Formation in English–German Neural Machine Translation. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 4227–4232, Online. Association for Computational Linguistics.
Cite (Informal):
Modeling Word Formation in English–German Neural Machine Translation (Weller-Di Marco & Fraser, ACL 2020)
PDF:
https://preview.aclanthology.org/nschneid-patch-2/2020.acl-main.389.pdf
Video:
http://slideslive.com/38929104