What Knowledge Is Needed? Towards Explainable Memory for kNN-MT Domain Adaptation

Wenhao Zhu, Shujian Huang, Yunzhe Lv, Xin Zheng, Jiajun Chen


Abstract
kNN-MT presents a new paradigm for domain adaptation by building an external datastore, which usually saves all target-language token occurrences in the parallel corpus. As a result, the constructed datastore is usually large and possibly redundant. In this paper, we investigate the interpretability issue of this approach: what knowledge does the NMT model need? We propose the notion of local correctness (LAC) as a new angle, which describes the potential translation correctness for a single entry and for a given neighborhood. Our empirical study shows that this investigation successfully finds the conditions under which the NMT model could easily fail and therefore needs the related knowledge. Experiments on six diverse target domains and two language pairs show that pruning according to local correctness brings a lighter and more explainable memory for kNN-MT domain adaptation.
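For readers unfamiliar with the paradigm the abstract builds on, the following is a minimal sketch of vanilla kNN-MT: a datastore saves one (hidden state, target token) entry per target-token occurrence, and at decoding time the k nearest entries vote on the next token, with the resulting distribution interpolated into the base NMT model's prediction. This is an illustration only, not the authors' implementation; all function names and parameter values are hypothetical.

```python
import numpy as np

def build_datastore(hidden_states, target_tokens):
    """Save one (key, value) entry per target-token occurrence.

    keys:   decoder hidden states, shape (N, d)
    values: the target tokens those states produced, shape (N,)
    """
    keys = np.asarray(hidden_states, dtype=np.float32)
    values = np.asarray(target_tokens)
    return keys, values

def knn_distribution(query, keys, values, vocab_size, k=4, temperature=10.0):
    """Next-token distribution from the k nearest datastore entries."""
    dists = np.sum((keys - query) ** 2, axis=1)   # squared L2 to every key
    nn = np.argsort(dists)[:k]                    # indices of the k nearest
    weights = np.exp(-dists[nn] / temperature)    # closer entries weigh more
    probs = np.zeros(vocab_size)
    for token, w in zip(values[nn], weights):     # aggregate votes per token
        probs[token] += w
    return probs / probs.sum()

def interpolate(p_nmt, p_knn, lam=0.5):
    """Final prediction: (1 - lam) * NMT distribution + lam * kNN distribution."""
    return (1 - lam) * p_nmt + lam * p_knn
```

Because every target-token occurrence becomes an entry, the datastore grows with the corpus, which is exactly the redundancy the paper's local-correctness pruning targets.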
Anthology ID:
2023.findings-acl.177
Volume:
Findings of the Association for Computational Linguistics: ACL 2023
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
2824–2836
URL:
https://aclanthology.org/2023.findings-acl.177
DOI:
10.18653/v1/2023.findings-acl.177
Cite (ACL):
Wenhao Zhu, Shujian Huang, Yunzhe Lv, Xin Zheng, and Jiajun Chen. 2023. What Knowledge Is Needed? Towards Explainable Memory for kNN-MT Domain Adaptation. In Findings of the Association for Computational Linguistics: ACL 2023, pages 2824–2836, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
What Knowledge Is Needed? Towards Explainable Memory for kNN-MT Domain Adaptation (Zhu et al., Findings 2023)
PDF:
https://preview.aclanthology.org/ingest-acl-2023-videos/2023.findings-acl.177.pdf