Boosting Neural Machine Translation with Similar Translations

Jitao Xu, Josep Crego, Jean Senellart


Abstract
This presentation demonstrates data augmentation methods for Neural Machine Translation that make use of similar translations, in much the same way a human translator employs fuzzy matches. We show how we simply feed the neural model with information from both the source and target sides of the fuzzy matches, and we extend the notion of similarity to include semantically related translations retrieved using distributed sentence representations. We show that translations based on fuzzy matching provide the model with “copy” information, while translations based on embedding similarities tend to extend the translation “context”. Results indicate that the effects of the two kinds of similar sentences add up to further boost accuracy, combine naturally with model fine-tuning, and provide dynamic adaptation for unseen translation pairs. Tests on multiple data sets and domains show consistent accuracy improvements.
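
The augmentation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the token-level edit-distance score, the thresholds, the `<sep>` separator token, the function names, and the choice to append only the target side of the retrieved match. The sentence encoder that would produce the embedding vectors is out of scope, so the vectors are taken as given.

```python
import math
from typing import List, Optional, Sequence

SEP = "<sep>"  # hypothetical separator token, not taken from the paper


def edit_distance(a: List[str], b: List[str]) -> int:
    """Token-level Levenshtein distance between two token lists."""
    prev = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, start=1):
        curr = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def fuzzy_score(a: str, b: str) -> float:
    """Assumed fuzzy-match score: 1 - edit_distance / length of the longer sentence."""
    ta, tb = a.split(), b.split()
    longest = max(len(ta), len(tb))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(ta, tb) / longest


def best_fuzzy_match(query: str, sources: Sequence[str],
                     threshold: float = 0.5) -> Optional[int]:
    """Index of the most similar memory source at or above threshold, else None."""
    best_idx, best_score = None, threshold
    for idx, src in enumerate(sources):
        score = fuzzy_score(query, src)
        if score >= best_score and (best_idx is None or score > best_score):
            best_idx, best_score = idx, score
    return best_idx


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def best_embedding_match(query_vec: Sequence[float],
                         memory_vecs: Sequence[Sequence[float]],
                         threshold: float = 0.8) -> Optional[int]:
    """Index of the nearest memory vector at or above threshold, else None."""
    best_idx, best_score = None, threshold
    for idx, vec in enumerate(memory_vecs):
        score = cosine(query_vec, vec)
        if score >= best_score and (best_idx is None or score > best_score):
            best_idx, best_score = idx, score
    return best_idx


def augment(query: str, similar_target: Optional[str]) -> str:
    """Append the retrieved translation to the source; leave it unchanged if none."""
    if similar_target is None:
        return query
    return f"{query} {SEP} {similar_target}"
```

In use, a sentence to translate is compared against the source side of a translation memory with either retriever, and the target side of the returned entry is passed to `augment`. The augmented string is what the translation model would then receive as input; when no entry clears the threshold, the sentence is passed through unchanged.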
Anthology ID:
2022.amta-upg.20
Volume:
Proceedings of the 15th Biennial Conference of the Association for Machine Translation in the Americas (Volume 2: Users and Providers Track and Government Track)
Month:
September
Year:
2022
Address:
Orlando, USA
Venue:
AMTA
Publisher:
Association for Machine Translation in the Americas
Pages:
282–292
URL:
https://aclanthology.org/2022.amta-upg.20
Cite (ACL):
Jitao Xu, Josep Crego, and Jean Senellart. 2022. Boosting Neural Machine Translation with Similar Translations. In Proceedings of the 15th Biennial Conference of the Association for Machine Translation in the Americas (Volume 2: Users and Providers Track and Government Track), pages 282–292, Orlando, USA. Association for Machine Translation in the Americas.
Cite (Informal):
Boosting Neural Machine Translation with Similar Translations (Xu et al., AMTA 2022)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2022.amta-upg.20.pdf
Presentation:
2022.amta-upg.20.Presentation.pdf