Multi-Reward based Reinforcement Learning for Neural Machine Translation

Shuo Sun, Hongxu Hou, Nier Wu, Ziyue Guo, Chaowei Zhang


Abstract
Reinforcement learning (RL) has made remarkable progress in neural machine translation (NMT). However, problems such as an uneven sampling distribution, sparse rewards, and high variance remain in the training phase. We therefore propose a multi-reward reinforcement learning training strategy that decouples action selection from value estimation. Our method also incorporates language model rewards to jointly optimize the model parameters. In addition, we add Gumbel noise during sampling to obtain more effective semantic information. To verify the robustness of our method, we conduct experiments not only on large corpora but also on low-resource languages. Experimental results show that our approach outperforms the baselines on the WMT14 English-German, LDC2014 Chinese-English, and CWMT2018 Mongolian-Chinese tasks, which fully demonstrates the effectiveness of our method.
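The Gumbel-noise sampling mentioned in the abstract is commonly realized via the Gumbel-max trick: perturbing the model's logits with i.i.d. Gumbel(0, 1) noise and taking the argmax is equivalent to sampling from the softmax distribution. The sketch below illustrates this standard trick only; the paper's exact formulation may differ, and the function name and NumPy setup are illustrative assumptions.

```python
import numpy as np

def gumbel_sample(logits, rng):
    """Sample a token index via the Gumbel-max trick.

    Adding i.i.d. Gumbel(0, 1) noise to the logits and taking the
    argmax draws an index with probability softmax(logits)[index].
    """
    # Gumbel(0, 1) noise: -log(-log(U)) with U ~ Uniform(0, 1)
    gumbel_noise = -np.log(-np.log(rng.uniform(size=logits.shape)))
    return int(np.argmax(logits + gumbel_noise))
```

Because the noise makes sampling a deterministic argmax over perturbed scores, exploration can be injected during decoding without changing the underlying categorical distribution the policy defines.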
Anthology ID:
2020.ccl-1.91
Volume:
Proceedings of the 19th Chinese National Conference on Computational Linguistics
Month:
October
Year:
2020
Address:
Haikou, China
Editors:
Maosong Sun (孙茂松), Sujian Li (李素建), Yue Zhang (张岳), Yang Liu (刘洋)
Venue:
CCL
Publisher:
Chinese Information Processing Society of China
Pages:
984–993
Language:
English
URL:
https://aclanthology.org/2020.ccl-1.91
Cite (ACL):
Shuo Sun, Hongxu Hou, Nier Wu, Ziyue Guo, and Chaowei Zhang. 2020. Multi-Reward based Reinforcement Learning for Neural Machine Translation. In Proceedings of the 19th Chinese National Conference on Computational Linguistics, pages 984–993, Haikou, China. Chinese Information Processing Society of China.
Cite (Informal):
Multi-Reward based Reinforcement Learning for Neural Machine Translation (Sun et al., CCL 2020)
PDF:
https://preview.aclanthology.org/emnlp-22-attachments/2020.ccl-1.91.pdf