Yafu Li


2022

Prompt-Driven Neural Machine Translation
Yafu Li | Yongjing Yin | Jing Li | Yue Zhang
Findings of the Association for Computational Linguistics: ACL 2022

Neural machine translation (NMT) has improved significantly in recent years. However, NMT models still face various challenges, including fragility and a lack of style flexibility. Moreover, current methods for instance-level constraints are limited in that they are either constraint-specific or model-specific. To this end, we propose prompt-driven neural machine translation, which incorporates prompts to enhance translation control and enrich flexibility. Empirical results demonstrate the effectiveness of our method in both responding to prompts and translation quality. Through human evaluation, we further show the flexibility of prompt control and its efficiency in human-in-the-loop translation.

Multi-Granularity Optimization for Non-Autoregressive Translation
Yafu Li | Leyang Cui | Yongjing Yin | Yue Zhang
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Despite its low latency, non-autoregressive machine translation (NAT) suffers from severe performance deterioration due to its naive independence assumption. This assumption is further strengthened by the cross-entropy loss, which encourages a strict token-by-token match between the hypothesis and the reference. To alleviate this issue, we propose multi-granularity optimization for NAT, which collects model behaviours on translation segments of various granularities and integrates the feedback for backpropagation. Experiments on four WMT benchmarks show that the proposed method significantly outperforms the baseline models trained with cross-entropy loss, and achieves the best performance on WMT’16 En⇔Ro and highly competitive results on WMT’14 En⇔De for fully non-autoregressive translation.

Categorizing Semantic Representations for Neural Machine Translation
Yongjing Yin | Yafu Li | Fandong Meng | Jie Zhou | Yue Zhang
Proceedings of the 29th International Conference on Computational Linguistics

Modern neural machine translation (NMT) models have achieved competitive performance on standard benchmarks. However, they have recently been shown to be limited in compositional generalization: they fail to effectively learn the translation of atoms (e.g., words) and their semantic composition (e.g., modification) from seen compounds (e.g., phrases), and thus suffer from significantly weakened translation performance on unseen compounds during inference. We address this issue by introducing categorization to the source contextualized representations. The main idea is to enhance generalization by reducing sparsity and overfitting, which is achieved by finding prototypes of token representations over the training set and integrating their embeddings into the source encoding. Experiments on a dedicated MT dataset (i.e., CoGnition) show that our method reduces compositional generalization error rates by 24%. In addition, our conceptually simple method gives consistently better results than the Transformer baseline on a range of general MT datasets.

2021

On Compositional Generalization of Neural Machine Translation
Yafu Li | Yongjing Yin | Yulong Chen | Yue Zhang
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Modern neural machine translation (NMT) models have achieved competitive performance on standard benchmarks such as WMT. However, significant issues remain in areas such as robustness and domain generalization. In this paper, we study NMT models from the perspective of compositional generalization by building a benchmark dataset, CoGnition, consisting of 216k clean and consistent sentence pairs. We quantitatively analyze the effects of various factors using compound translation error rate, and then demonstrate that the NMT model fails badly on compositional generalization, although it performs remarkably well under traditional metrics.