Incorporating External Annotation to improve Named Entity Translation in NMT

Maciej Modrzejewski, Miriam Exel, Bianka Buschbeck, Thanh-Le Ha, Alexander Waibel


Abstract
The correct translation of named entities (NEs) still poses a challenge for conventional neural machine translation (NMT) systems. This study explores methods for incorporating named entity recognition (NER) into NMT with the aim of improving named entity translation. It proposes an annotation method that integrates named entities and inside–outside–beginning (IOB) tagging into the neural network input with the use of source factors. Our experiments on English→German and English→Chinese show that just by including different NE classes and IOB tagging, we can increase the BLEU score by around 1 point using the standard test set from WMT2019 and achieve up to a 12% increase in NE translation rates over a strong baseline.
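As a rough illustration of the annotation idea described in the abstract, the sketch below attaches an NE class and an IOB tag to each source token as two parallel factor streams. The span format, the label names (PER, LOC), the function names, and the `|` separator are assumptions made for this example, not details taken from the paper, and subword segmentation is not handled.

```python
from typing import List, Tuple

# Hypothetical span format: (start_token, end_token_exclusive, ne_class).
Span = Tuple[int, int, str]


def build_source_factors(
    tokens: List[str], spans: List[Span]
) -> Tuple[List[str], List[str]]:
    """Return two factor streams aligned with `tokens`: NE class and IOB tag."""
    classes = ["O"] * len(tokens)  # "O" marks tokens outside any entity
    iob = ["O"] * len(tokens)
    for start, end, label in spans:
        for i in range(start, end):
            classes[i] = label
            # First token of an entity is "B" (beginning), the rest are "I" (inside).
            iob[i] = "B" if i == start else "I"
    return classes, iob


def format_factored(
    tokens: List[str], classes: List[str], iob: List[str], sep: str = "|"
) -> str:
    """Join each token with its factors, e.g. 'Paris|LOC|B' (illustrative format)."""
    return " ".join(sep.join(triple) for triple in zip(tokens, classes, iob))


tokens = ["Angela", "Merkel", "visited", "Paris", "."]
spans = [(0, 2, "PER"), (3, 4, "LOC")]  # assumed output of an external NER tool
classes, iob = build_source_factors(tokens, spans)
print(format_factored(tokens, classes, iob))
# Angela|PER|B Merkel|PER|I visited|O|O Paris|LOC|B .|O|O
```

In a factored NMT setup, each stream would typically get its own embedding that is combined with the word embedding at the encoder input; how the factors are combined here is left open because the abstract does not specify it.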
Anthology ID:
2020.eamt-1.6
Volume:
Proceedings of the 22nd Annual Conference of the European Association for Machine Translation
Month:
November
Year:
2020
Address:
Lisboa, Portugal
Venue:
EAMT
Publisher:
European Association for Machine Translation
Pages:
45–51
URL:
https://aclanthology.org/2020.eamt-1.6
Cite (ACL):
Maciej Modrzejewski, Miriam Exel, Bianka Buschbeck, Thanh-Le Ha, and Alexander Waibel. 2020. Incorporating External Annotation to improve Named Entity Translation in NMT. In Proceedings of the 22nd Annual Conference of the European Association for Machine Translation, pages 45–51, Lisboa, Portugal. European Association for Machine Translation.
Cite (Informal):
Incorporating External Annotation to improve Named Entity Translation in NMT (Modrzejewski et al., EAMT 2020)
PDF:
https://preview.aclanthology.org/update-css-js/2020.eamt-1.6.pdf