Zhejian Lai

Also published as: 哲剑


2025

Alleviating Distribution Shift in Synthetic Data for Machine Translation Quality Estimation
Xiang Geng | Zhejian Lai | Jiajun Chen | Hao Yang | Shujian Huang
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Quality Estimation (QE) models evaluate the quality of machine translations without reference translations, serving as reward models for the translation task. Due to data scarcity, synthetic data generation has emerged as a promising solution. However, synthetic QE data often suffers from distribution shift, which can manifest as discrepancies between pseudo and real translations, or as pseudo labels that do not align with human preferences. To tackle this issue, we introduce DCSQE, a novel framework for alleviating distribution shift in synthetic QE data. To reduce the difference between pseudo and real translations, we employ the constrained beam search algorithm and enhance translation diversity through the use of distinct generation models. DCSQE uses references (i.e., translation supervision signals) to guide both the generation and annotation processes, enhancing the quality of token-level labels. DCSQE further identifies the shortest phrase covering consecutive error tokens, mimicking human annotation behavior, to assign the final phrase-level labels. Notably, we underscore that a translation model cannot accurately annotate its own translations. Extensive experiments demonstrate that DCSQE outperforms SOTA baselines like CometKiwi in both supervised and unsupervised settings. Further analysis offers insights into synthetic data generation that could benefit reward models for other tasks. The code is available at https://github.com/NJUNLP/njuqe.
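The phrase-level labeling step lends itself to a short illustration. Below is a minimal Python sketch, assuming token-level OK/BAD tags and a hypothetical phrase inventory (e.g., from a chunker) as inputs: runs of consecutive BAD tokens are expanded to the shortest candidate phrase covering them. This is an illustrative simplification, not the authors' released implementation; see the njuqe repository for that.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end), end exclusive

def bad_runs(tags: List[str]) -> List[Span]:
    """Collect maximal runs of consecutive BAD token labels."""
    runs, start = [], None
    for i, tag in enumerate(tags + ["OK"]):  # sentinel closes a trailing run
        if tag == "BAD" and start is None:
            start = i
        elif tag != "BAD" and start is not None:
            runs.append((start, i))
            start = None
    return runs

def shortest_covering_phrase(run: Span, phrases: List[Span]) -> Span:
    """Return the shortest candidate phrase covering the run; fall back to the run."""
    s, e = run
    covering = [p for p in phrases if p[0] <= s and p[1] >= e]
    return min(covering, key=lambda p: p[1] - p[0]) if covering else run

def phrase_level_tags(tags: List[str], phrases: List[Span]) -> List[str]:
    """Relabel every token inside the covering phrase of each error run as BAD."""
    out = list(tags)
    for run in bad_runs(tags):
        ps, pe = shortest_covering_phrase(run, phrases)
        out[ps:pe] = ["BAD"] * (pe - ps)
    return out

# One error run (tokens 2-3) expands to its shortest covering phrase (1, 4).
print(phrase_level_tags(["OK", "OK", "BAD", "BAD", "OK"], [(0, 1), (1, 4), (4, 5)]))
# -> ['OK', 'BAD', 'BAD', 'BAD', 'OK']
```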

2024

大模型时代的多语言研究综述 (A Survey of Multilingual Research in the Large Language Model Era)
Changjiang Gao (长江 高) | Hao Zhou (昊 周) | Shuaijie She (佘帅杰) | Haoming Zhong (钟昊鸣) | Sizhe Liu (斯哲 刘) | Zhejian Lai (赖哲剑) | Zhijun Wang (王志军) | Shujian Huang (书剑 黄)
Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 2: Frontier Forum)

"Since the advent of the large language model era, the traditional paradigm of multilingual research has changed dramatically. Some traditional tasks have seen breakthrough solutions, a variety of new tasks have emerged, and a large body of multilingual research now builds on multilingual large models and targets improvements to their capabilities. In response to this shift in the field, this paper organizes and summarizes the progress of multilingual research since the beginning of the large-model era, covering multilingual large models, datasets, and tasks, as well as related frontier research directions and challenges, in the hope of providing a reference for the future development of multilingual research under the large-model paradigm."

2023

Improved Pseudo Data for Machine Translation Quality Estimation with Constrained Beam Search
Xiang Geng | Yu Zhang | Zhejian Lai | Shuaijie She | Wei Zou | Shimin Tao | Hao Yang | Jiajun Chen | Shujian Huang
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

Machine translation (MT) quality estimation (QE) is a crucial task for estimating the quality of MT outputs when reference translations are unavailable. Many studies focus on generating pseudo data from large parallel corpora and achieve remarkable success in the supervised setting. However, pseudo data solutions are less satisfying in unsupervised scenarios because the pseudo labels are inaccurate or the pseudo translations differ from real ones. To address these problems, we propose generating pseudo data with the MT model under constrained beam search (CBSQE). CBSQE preserves the reference parts with high MT probabilities as correct translations, while the remaining parts are generated by the MT model and labeled as errors. CBSQE can therefore reduce the false negative labels caused by synonyms. Overall, beam search prefers more realistic hypotheses with higher MT generation likelihood. Extensive experiments demonstrate that CBSQE outperforms strong baselines in both supervised and unsupervised settings. Further analyses show the superiority of CBSQE. The code is available at https://github.com/NJUNLP/njuqe.
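To make the labeling logic concrete, here is a greedy, single-beam sketch of the CBSQE idea in Python. The `mt_next_dist` interface (a function returning the MT model's next-token distribution), the keep threshold, and the one-for-one advance through the reference are all assumptions for illustration; the paper uses full constrained beam search, and the released njuqe code is the authoritative implementation.

```python
from typing import Callable, Dict, List, Tuple

def cbsqe_greedy(
    reference: List[str],
    mt_next_dist: Callable[[List[str]], Dict[str, float]],
    keep_threshold: float = 0.5,
) -> Tuple[List[str], List[str]]:
    """Build a pseudo translation and token-level OK/BAD labels from a reference."""
    hypothesis, tags = [], []
    for ref_tok in reference:  # one-for-one advance: a simplification of real CBS
        dist = mt_next_dist(hypothesis)
        if dist.get(ref_tok, 0.0) >= keep_threshold:
            # The MT model agrees with the reference: keep the token, tag it OK.
            hypothesis.append(ref_tok)
            tags.append("OK")
        else:
            # The MT model diverges: emit its own best token as a pseudo error.
            hypothesis.append(max(dist, key=dist.get))
            tags.append("BAD")
    return hypothesis, tags

# Toy next-token distribution keyed on prefix length, for demonstration only.
def toy_dist(prefix: List[str]) -> Dict[str, float]:
    table = {0: {"the": 0.9}, 1: {"cat": 0.2, "dog": 0.7}, 2: {"sat": 0.8}}
    return table.get(len(prefix), {"<eos>": 1.0})

print(cbsqe_greedy(["the", "cat", "sat"], toy_dist))
# -> (['the', 'dog', 'sat'], ['OK', 'BAD', 'OK'])
```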

Unify Word-level and Span-level Tasks: NJUNLP’s Participation for the WMT2023 Quality Estimation Shared Task
Xiang Geng | Zhejian Lai | Yu Zhang | Shimin Tao | Hao Yang | Jiajun Chen | Shujian Huang
Proceedings of the Eighth Conference on Machine Translation

We introduce the submissions of the NJUNLP team to the WMT 2023 Quality Estimation (QE) shared task. Our team submitted predictions for the English-German language pair on both sub-tasks: (i) sentence- and word-level quality prediction, and (ii) fine-grained error span detection. This year, we further explore pseudo data methods for QE based on the NJUQE framework (https://github.com/NJUNLP/njuqe). We generate pseudo MQM data using parallel data from the WMT translation task. We pre-train the XLM-R large model on pseudo QE data and then fine-tune it on real QE data, jointly learning sentence-level scores and word-level tags at both stages. Empirically, we conduct experiments to find the key hyper-parameters that improve performance. Technically, we propose a simple method that converts the word-level outputs into fine-grained error span results. Overall, our models achieved the best results in English-German for both the word-level and fine-grained error span detection sub-tasks by a considerable margin.
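The word-to-span conversion can be sketched as follows: contiguous BAD-tagged words are merged into character-level error spans of the kind the fine-grained sub-task expects. The whitespace tokenization and the fixed "minor" severity are illustrative assumptions, not the authors' exact conversion rule.

```python
from typing import List, Tuple

def tags_to_spans(sentence: str, tags: List[str]) -> List[Tuple[int, int, str]]:
    """Map word-level BAD tags onto character spans (start, end, severity)."""
    words = sentence.split()
    assert len(words) == len(tags), "one tag per whitespace token"
    spans, offset, open_start = [], 0, None
    for word, tag in zip(words, tags):
        start, end = offset, offset + len(word)
        if tag == "BAD":
            if open_start is None:
                open_start = start            # open a new error span
        elif open_start is not None:
            spans.append((open_start, prev_end, "minor"))
            open_start = None                 # close the span at the last BAD word
        prev_end = end
        offset = end + 1                      # +1 for the separating space
    if open_start is not None:                # span reaching the sentence end
        spans.append((open_start, prev_end, "minor"))
    return spans

print(tags_to_spans("der Hund lief schnell", ["OK", "BAD", "BAD", "OK"]))
# -> [(4, 13, 'minor')]   i.e. the characters of "Hund lief"
```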