Code-Switching for Enhancing NMT with Pre-Specified Translation

Kai Song, Yue Zhang, Heng Yu, Weihua Luo, Kun Wang, Min Zhang


Abstract
Leveraging user-provided translations to constrain NMT has practical significance. Existing methods can be classified into two main categories, namely the use of placeholder tags for lexicon words and the use of hard constraints during decoding. Both methods can hurt translation fidelity for various reasons. We investigate a data augmentation method, constructing code-switched training data by replacing source phrases with their target translations. Our method does not change the NMT model or decoding algorithm, allowing the model to learn lexicon translations by copying source-side target words. Extensive experiments show that our method achieves consistent improvements over existing approaches, improving translation of constrained words without hurting unconstrained words.
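The sketch below illustrates the data augmentation idea described in the abstract: source phrases with a pre-specified translation are replaced in place by their target-language translation, yielding code-switched source sentences from which the model can learn to copy source-side target words. This is a minimal illustration assuming a simple phrase-level lexicon; the function and variable names (e.g. make_code_switched) are hypothetical and not taken from the paper.

```python
def make_code_switched(source_tokens, lexicon):
    """Replace source phrases found in `lexicon` with their target translations.

    source_tokens: list of source-language tokens
    lexicon: dict mapping source phrases (tuples of tokens) to lists of target tokens
    """
    max_len = max((len(k) for k in lexicon), default=0)
    out, i = [], 0
    while i < len(source_tokens):
        replaced = False
        # Greedily try the longest matching source phrase first.
        for n in range(min(max_len, len(source_tokens) - i), 0, -1):
            phrase = tuple(source_tokens[i:i + n])
            if phrase in lexicon:
                out.extend(lexicon[phrase])   # splice in the target-side translation
                i += n
                replaced = True
                break
        if not replaced:
            out.append(source_tokens[i])
            i += 1
    return out


# Hypothetical example: the source phrase "纽约" is replaced by its
# pre-specified translation "New York" in the source sentence.
lexicon = {("纽约",): ["New", "York"]}
print(make_code_switched(["我", "住", "在", "纽约"], lexicon))
# -> ['我', '住', '在', 'New', 'York']
```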
Anthology ID:
N19-1044
Volume:
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)
Month:
June
Year:
2019
Address:
Minneapolis, Minnesota
Editors:
Jill Burstein, Christy Doran, Thamar Solorio
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
449–459
URL:
https://aclanthology.org/N19-1044
DOI:
10.18653/v1/N19-1044
Cite (ACL):
Kai Song, Yue Zhang, Heng Yu, Weihua Luo, Kun Wang, and Min Zhang. 2019. Code-Switching for Enhancing NMT with Pre-Specified Translation. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 449–459, Minneapolis, Minnesota. Association for Computational Linguistics.
Cite (Informal):
Code-Switching for Enhancing NMT with Pre-Specified Translation (Song et al., NAACL 2019)
PDF:
https://preview.aclanthology.org/ingest-2024-clasp/N19-1044.pdf
Code:
batman2013/e-commerce_test_sets