Qiu Ran
2021
WeChat Neural Machine Translation Systems for WMT21
Xianfeng Zeng
|
Yijin Liu
|
Ernan Li
|
Qiu Ran
|
Fandong Meng
|
Peng Li
|
Jinan Xu
|
Jie Zhou
Proceedings of the Sixth Conference on Machine Translation
This paper introduces WeChat AI’s participation in WMT 2021 shared news translation task on English->Chinese, English->Japanese, Japanese->English and English->German. Our systems are based on the Transformer (Vaswani et al., 2017) with several novel and effective variants. In our experiments, we employ data filtering, large-scale synthetic data generation (i.e., back-translation, knowledge distillation, forward-translation, iterative in-domain knowledge transfer), advanced finetuning approaches, and boosted Self-BLEU based model ensemble. Our constrained systems achieve 36.9, 46.9, 27.8 and 31.3 case-sensitive BLEU scores on English->Chinese, English->Japanese, Japanese->English and English->German, respectively. The BLEU scores of English->Chinese, English->Japanese and Japanese->English are the highest among all submissions, and that of English->German is the highest among all constrained submissions.
2020
Learning to Recover from Multi-Modality Errors for Non-Autoregressive Neural Machine Translation
Qiu Ran
|
Yankai Lin
|
Peng Li
|
Jie Zhou
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Non-autoregressive neural machine translation (NAT) predicts the entire target sequence simultaneously and significantly accelerates inference process. However, NAT discards the dependency information in a sentence, and thus inevitably suffers from the multi-modality problem: the target tokens may be provided by different possible translations, often causing token repetitions or missing. To alleviate this problem, we propose a novel semi-autoregressive model RecoverSAT in this work, which generates a translation as a sequence of segments. The segments are generated simultaneously while each segment is predicted token-by-token. By dynamically determining segment length and deleting repetitive segments, RecoverSAT is capable of recovering from repetitive and missing token errors. Experimental results on three widely-used benchmark datasets show that our proposed model achieves more than 4 times speedup while maintaining comparable performance compared with the corresponding autoregressive model.
2019
NumNet: Machine Reading Comprehension with Numerical Reasoning
Qiu Ran
|
Yankai Lin
|
Peng Li
|
Jie Zhou
|
Zhiyuan Liu
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
Numerical reasoning, such as addition, subtraction, sorting and counting is a critical skill in human’s reading comprehension, which has not been well considered in existing machine reading comprehension (MRC) systems. To address this issue, we propose a numerical MRC model named as NumNet, which utilizes a numerically-aware graph neural network to consider the comparing information and performs numerical reasoning over numbers in the question and passage. Our system achieves an EM-score of 64.56% on the DROP dataset, outperforming all existing machine reading comprehension models by considering the numerical relations among numbers.
Search
Co-authors
- Peng Li 3
- Jie Zhou 3
- Yankai Lin 2
- Zhiyuan Liu 1
- Xianfeng Zeng 1
- show all...