融合确定性因子及区域密度的k-最近邻机器翻译方法(A k-Nearest-Neighbor Machine Translation Method Combining Certainty Factor and Region Density)
Qi Rui (齐睿), Shi Xiangyu (石响宇), Man Zhibo (满志博), Xu Jinan (徐金安), Chen Yufeng (陈钰枫)
Abstract
“k-最近邻机器翻译(kNN-MT)是近年来神经机器翻译领域的一个重要研究方向。此类方法可以在不更新机器翻译模型的情况下提高翻译质量,但训练数据中高低频单词的数量不均衡限制了模型效果,且固定的k值无法对处于不同密度分布的数据都产生良好的翻译结果。为此本文提出了一种创新的kNN-MT方法,引入确定性因子(CF)来降低数据不均衡对模型效果的影响,并根据测试点周边数据密度动态选择k值。在多领域德-英翻译数据集上,相比基线实验,本方法在四个领域上翻译效果均有提升,其中三个领域上提升超过1个BLEU,有效提高了神经机器翻译模型的翻译质量。”- Anthology ID:
- 2024.ccl-1.16
- Volume:
- Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 1: Main Conference)
- Month:
- July
- Year:
- 2024
- Address:
- Taiyuan, China
- Editors:
- Maosong Sun, Jiye Liang, Xianpei Han, Zhiyuan Liu, Yulan He
- Venue:
- CCL
- SIG:
- Publisher:
- Chinese Information Processing Society of China
- Note:
- Pages:
- 217–229
- Language:
- Chinese
- URL:
- https://preview.aclanthology.org/fix-sig-urls/2024.ccl-1.16/
- DOI:
- Cite (ACL):
- Qi Rui, Shi Xiangyu, Man Zhibo, Xu Jinan, and Chen Yufeng. 2024. 融合确定性因子及区域密度的k-最近邻机器翻译方法(A k-Nearest-Neighbor Machine Translation Method Combining Certainty Factor and Region Density). In Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 1: Main Conference), pages 217–229, Taiyuan, China. Chinese Information Processing Society of China.
- Cite (Informal):
- 融合确定性因子及区域密度的k-最近邻机器翻译方法(A k-Nearest-Neighbor Machine Translation Method Combining Certainty Factor and Region Density) (Rui et al., CCL 2024)
- PDF:
- https://preview.aclanthology.org/fix-sig-urls/2024.ccl-1.16.pdf