Enhancing Neural Machine Translation Through Target Language Data: A kNN-LM Approach for Domain Adaptation
Abudurexiti Reheman, Hongyu Liu, Junhao Ruan, Abudukeyumu Abudula, Yingfeng Luo, Tong Xiao, JingBo Zhu
Abstract
Neural machine translation (NMT) has advanced significantly, yet challenges remain in adapting to new domains. This issue is further exacerbated in scenarios where bilingual data is limited. To address this, we propose kNN-LM-NMT, a method that leverages semantically similar target language sentences in the kNN framework. Our approach generates a probability distribution over these sentences during decoding, and this distribution is then interpolated with the NMT model’s distribution. Additionally, we introduce an n-gram-based approach to focus on similar fragments, enabling the model to avoid the noise introduced by the non-similar parts. To enhance accuracy, we further incorporate cross-lingual retrieval similarity to refine the kNN probability distribution. Extensive experiments on multi-domain datasets demonstrate significant performance improvements in both high-resource and low-resource scenarios. Our approach effectively extracts translation knowledge from limited target domain data, and benefits from large-scale monolingual data for robust context representation.
- Anthology ID:
- 2025.acl-long.496
- Volume:
- Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
- Month:
- July
- Year:
- 2025
- Address:
- Vienna, Austria
- Editors:
- Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
- Venue:
- ACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 10053–10065
- URL:
- https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.496/
- Cite (ACL):
- Abudurexiti Reheman, Hongyu Liu, Junhao Ruan, Abudukeyumu Abudula, Yingfeng Luo, Tong Xiao, and JingBo Zhu. 2025. Enhancing Neural Machine Translation Through Target Language Data: A kNN-LM Approach for Domain Adaptation. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 10053–10065, Vienna, Austria. Association for Computational Linguistics.
- Cite (Informal):
- Enhancing Neural Machine Translation Through Target Language Data: A kNN-LM Approach for Domain Adaptation (Reheman et al., ACL 2025)
- PDF:
- https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.496.pdf
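
The abstract describes building a kNN probability distribution during decoding and interpolating it with the NMT model's distribution. The abstract does not give the exact formulas, so the following is only a minimal sketch in the style of generic kNN-LM interpolation, not the authors' implementation. The function names, the softmax over negative distances, the `temperature` parameter, and the mixing weight `lam` are all assumptions made for illustration. The n-gram filtering and the cross-lingual similarity refinement are not modeled here.

```python
import math


def knn_distribution(neighbors, vocab_size, temperature=1.0):
    """Turn retrieved neighbors into a distribution over the vocabulary.

    neighbors: list of (token_id, distance) pairs (hypothetical format).
    Assumption: weights are a softmax over negative distances.
    """
    if not neighbors:
        raise ValueError("at least one neighbor is required")
    scores = [-dist / temperature for _, dist in neighbors]
    max_score = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - max_score) for s in scores]
    total = sum(weights)
    probs = [0.0] * vocab_size
    for (token_id, _), weight in zip(neighbors, weights):
        probs[token_id] += weight / total  # neighbors sharing a token add up
    return probs


def interpolate(p_nmt, p_knn, lam=0.5):
    """Mix the two distributions: lam * p_knn + (1 - lam) * p_nmt."""
    if len(p_nmt) != len(p_knn):
        raise ValueError("distributions must cover the same vocabulary")
    return [lam * k + (1.0 - lam) * n for n, k in zip(p_nmt, p_knn)]


# Toy example with a three-token vocabulary.
p_nmt = [0.7, 0.2, 0.1]
neighbors = [(1, 0.0), (1, 0.0), (2, 0.0)]  # equidistant neighbors
p_knn = knn_distribution(neighbors, vocab_size=3)
p_final = interpolate(p_nmt, p_knn, lam=0.5)
```

In this toy case two of the three equidistant neighbors vote for token 1, so the kNN distribution shifts probability mass toward it and away from the NMT model's preferred token 0. A larger `lam` would trust the retrieved target-language evidence more.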