Jiacheng Zhang


2020

pdf bib
THUMT: An Open-Source Toolkit for Neural Machine Translation
Zhixing Tan | Jiacheng Zhang | Xuancheng Huang | Gang Chen | Shuo Wang | Maosong Sun | Huanbo Luan | Yang Liu
Proceedings of the 14th Conference of the Association for Machine Translation in the Americas (Volume 1: Research Track)

2018

pdf bib
Improving the Transformer Translation Model with Document-Level Context
Jiacheng Zhang | Huanbo Luan | Maosong Sun | Feifei Zhai | Jingfang Xu | Min Zhang | Yang Liu
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Although the Transformer translation model (Vaswani et al., 2017) has achieved state-of-the-art performance in a variety of translation tasks, how to use document-level context to deal with discourse phenomena problematic for Transformer still remains a challenge. In this work, we extend the Transformer model with a new context encoder to represent document-level context, which is then incorporated into the original encoder and decoder. As large-scale document-level parallel corpora are usually not available, we introduce a two-step training method to take full advantage of abundant sentence-level parallel corpora and limited document-level parallel corpora. Experiments on the NIST Chinese-English datasets and the IWSLT French-English datasets show that our approach improves over Transformer significantly.

2017

pdf bib
Prior Knowledge Integration for Neural Machine Translation using Posterior Regularization
Jiacheng Zhang | Yang Liu | Huanbo Luan | Jingfang Xu | Maosong Sun
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Although neural machine translation has made significant progress recently, how to integrate multiple overlapping, arbitrary prior knowledge sources remains a challenge. In this work, we propose to use posterior regularization to provide a general framework for integrating prior knowledge into neural machine translation. We represent prior knowledge sources as features in a log-linear model, which guides the learning processing of the neural translation model. Experiments on Chinese-English dataset show that our approach leads to significant improvements.