融合确定性因子及区域密度的k-最近邻机器翻译方法(A k-Nearest-Neighbor Machine Translation Method Combining Certainty Factor and Region Density)

Rui Qi (齐睿), Xiangyu Shi (石响宇), Zhibo Man (满志博), Jinan Xu (徐金安), Yufeng Chen (陈钰枫)


Abstract
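k-nearest-neighbor machine translation (kNN-MT) has been an important research direction in neural machine translation in recent years. Such methods can improve translation quality without updating the machine translation model, but the imbalance between high- and low-frequency words in the training data limits their effectiveness, and a fixed value of k cannot produce good translations for data lying in regions of different density. To address this, this paper proposes a novel kNN-MT method that introduces a certainty factor (CF) to reduce the impact of data imbalance on model performance and dynamically selects k according to the data density around the test point. On a multi-domain German-English translation dataset, the method improves translation quality over the baseline in all four domains, with gains of more than 1 BLEU in three of them.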
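As a rough illustration of the two ideas the abstract mentions, the sketch below shows the standard kNN-MT interpolation of a retrieval distribution with the NMT distribution, plus a density-driven choice of k. It is a minimal sketch under stated assumptions, not the authors' implementation: the density rule in `choose_k` (count neighbors within a radius), the parameters `radius`, `temperature`, and `lam`, and all example values are hypothetical, and the paper's certainty factor is not modeled because the abstract does not define it.

```python
import math
from collections import defaultdict


def choose_k(distances, radius, k_min=1, k_max=8):
    """Hypothetical density rule: use as many neighbors as fall within
    `radius` of the query, clamped to [k_min, k_max]."""
    dense = sum(1 for d in distances if d <= radius)
    return max(k_min, min(k_max, dense))


def knn_distribution(neighbors, k, temperature=10.0):
    """Softmax over negative distances of the k nearest (distance, token)
    pairs, with weights summed per target token."""
    nearest = sorted(neighbors, key=lambda pair: pair[0])[:k]
    weights = defaultdict(float)
    for dist, token in nearest:
        weights[token] += math.exp(-dist / temperature)
    total = sum(weights.values())
    return {tok: w / total for tok, w in weights.items()}


def interpolate(p_nmt, p_knn, lam=0.5):
    """Standard kNN-MT mixture: lam * p_kNN + (1 - lam) * p_NMT."""
    tokens = set(p_nmt) | set(p_knn)
    return {
        t: lam * p_knn.get(t, 0.0) + (1.0 - lam) * p_nmt.get(t, 0.0)
        for t in tokens
    }


# Made-up retrieval results: (distance to query, target token).
neighbors = [(1.0, "house"), (2.0, "home"), (9.0, "building"), (1.5, "house")]
p_nmt = {"house": 0.6, "home": 0.3, "building": 0.1}

k = choose_k([d for d, _ in neighbors], radius=3.0)
p_knn = knn_distribution(neighbors, k)
p_final = interpolate(p_nmt, p_knn)
```

With these made-up values, three neighbors fall inside the radius, so the distant "building" entry is excluded from the retrieval distribution and contributes only through the NMT probabilities.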
Anthology ID:
2024.ccl-1.16
Volume:
Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 1: Main Conference)
Month:
July
Year:
2024
Address:
Taiyuan, China
Editors:
Sun Maosong, Liang Jiye, Han Xianpei, Liu Zhiyuan, He Yulan
Venue:
CCL
Publisher:
Chinese Information Processing Society of China
Pages:
217–229
Language:
Chinese
URL:
https://preview.aclanthology.org/author-degibert/2024.ccl-1.16/
Cite (ACL):
Rui Qi, Xiangyu Shi, Zhibo Man, Jinan Xu, and Yufeng Chen. 2024. 融合确定性因子及区域密度的k-最近邻机器翻译方法(A k-Nearest-Neighbor Machine Translation Method Combining Certainty Factor and Region Density). In Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 1: Main Conference), pages 217–229, Taiyuan, China. Chinese Information Processing Society of China.
Cite (Informal):
融合确定性因子及区域密度的k-最近邻机器翻译方法(A k-Nearest-Neighbor Machine Translation Method Combining Certainty Factor and Region Density) (Qi et al., CCL 2024)
PDF:
https://preview.aclanthology.org/author-degibert/2024.ccl-1.16.pdf