Neuron-Level Language Tag Injection Improves Zero-Shot Translation Performance

Jay Orten, Ammon Shurtz, Nancy Fulda, Stephen D. Richardson


Abstract
Language tagging, a method whereby source and target inputs are prefixed with a unique language token, has become the de facto standard for conditioning Multilingual Neural Machine Translation (MNMT) models on specific language directions. This conditioning can yield effective zero-shot translation abilities in MT models at scale for many languages. Expanding on previous work, we propose injection, a novel method of language tagging for MNMT in which the embedded representation of a language token is concatenated to the input of every linear layer. We explore a variety of tagging methods, with and without injection, showing that injection improves zero-shot translation performance, with gains of more than 2 BLEU points for certain language directions in our dataset.
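The abstract describes injection as concatenating the language-tag embedding to the input of every linear layer. A minimal NumPy sketch of one such layer is below; all names and dimensions are illustrative assumptions, not the paper's implementation, and a full model would apply this to every linear layer rather than one.

```python
import numpy as np

rng = np.random.default_rng(0)

d_model, d_tag, d_out = 8, 4, 8  # illustrative sizes, not from the paper

# The layer's weight matrix is widened by d_tag so it can also
# "see" the language-tag embedding at every position.
W = rng.standard_normal((d_model + d_tag, d_out)) * 0.1
b = np.zeros(d_out)

def injected_linear(x, tag_emb):
    """One linear layer with neuron-level tag injection (sketch):
    the tag embedding is concatenated to each position's input
    before the affine transform."""
    # x: (seq_len, d_model); tag_emb: (d_tag,)
    tag = np.broadcast_to(tag_emb, (x.shape[0], tag_emb.shape[0]))
    return np.concatenate([x, tag], axis=-1) @ W + b

x = rng.standard_normal((5, d_model))   # 5 token representations
tag_de = rng.standard_normal(d_tag)     # hypothetical <de> tag embedding
tag_fr = rng.standard_normal(d_tag)     # hypothetical <fr> tag embedding

out_de = injected_linear(x, tag_de)
out_fr = injected_linear(x, tag_fr)

print(out_de.shape)  # (5, 8): output width is unchanged
# Different tags give different outputs, so every layer is
# conditioned on the language direction, not just the input prefix.
print(np.allclose(out_de, out_fr))
```

This contrasts with standard prefix tagging, where the tag influences the model only through the token sequence; here the conditioning signal reaches each layer directly.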
Anthology ID:
2025.acl-srw.13
Volume:
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop)
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Jin Zhao, Mingyang Wang, Zhu Liu
Venues:
ACL | WS
Publisher:
Association for Computational Linguistics
Pages:
203–212
URL:
https://preview.aclanthology.org/landing_page/2025.acl-srw.13/
Cite (ACL):
Jay Orten, Ammon Shurtz, Nancy Fulda, and Stephen D. Richardson. 2025. Neuron-Level Language Tag Injection Improves Zero-Shot Translation Performance. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop), pages 203–212, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
Neuron-Level Language Tag Injection Improves Zero-Shot Translation Performance (Orten et al., ACL 2025)
PDF:
https://preview.aclanthology.org/landing_page/2025.acl-srw.13.pdf