The Effect of Language Diversity When Fine-Tuning Large Language Models for Translation

David Stap, Christof Monz


Abstract
Prior research diverges on language diversity in LLM fine-tuning: Some studies report benefits while others find no advantages. Through controlled fine-tuning experiments across 132 translation directions, we systematically resolve these disparities. We find that expanding language diversity during fine-tuning improves translation quality for both unsupervised and—surprisingly—supervised pairs, despite less diverse models being fine-tuned exclusively on these supervised pairs. However, benefits plateau or decrease beyond a certain diversity threshold. We show that increased language diversity creates more language-agnostic representations. These representational adaptations help explain the improved performance in models fine-tuned with greater diversity.
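To make the notion of "language-agnostic representations" concrete, the sketch below shows one common way such an analysis can be run: embed parallel sentences with a multilingual language model and measure how similar the resulting representations are across languages. This is an illustrative example only, not the authors' analysis; the model name (facebook/xglm-564M), the toy sentence triple, and the mean-pooling choice are placeholder assumptions.

# Illustrative sketch (not the paper's exact method): estimate how
# "language-agnostic" a model's sentence representations are by comparing
# mean-pooled hidden states of parallel sentences across languages.
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "facebook/xglm-564M"  # placeholder multilingual LM

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME)
model.eval()

# Tiny parallel examples (English / German / Dutch) standing in for a
# proper multi-parallel evaluation set such as FLORES.
parallel = {
    "en": "The cat sits on the mat.",
    "de": "Die Katze sitzt auf der Matte.",
    "nl": "De kat zit op de mat.",
}

def embed(text: str) -> torch.Tensor:
    """Mean-pool the last hidden layer over non-padding tokens."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # (1, seq_len, dim)
    mask = inputs["attention_mask"].unsqueeze(-1)   # (1, seq_len, 1)
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)

embeddings = {lang: embed(text) for lang, text in parallel.items()}

# Higher average cosine similarity between translations of the same sentence
# suggests more language-neutral (language-agnostic) representations.
langs = list(embeddings)
sims = []
for i in range(len(langs)):
    for j in range(i + 1, len(langs)):
        sims.append(
            torch.cosine_similarity(embeddings[langs[i]], embeddings[langs[j]]).item()
        )
print(f"mean cross-lingual similarity: {sum(sims) / len(sims):.3f}")

Under this kind of probe, comparing the same checkpoint before and after fine-tuning with different numbers of language pairs would indicate whether greater diversity pushes translations of the same sentence closer together in representation space.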
Anthology ID:
2025.findings-emnlp.224
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2025
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rosé, Violet Peng
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
4199–4211
URL:
https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.224/
DOI:
10.18653/v1/2025.findings-emnlp.224
Cite (ACL):
David Stap and Christof Monz. 2025. The Effect of Language Diversity When Fine-Tuning Large Language Models for Translation. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 4199–4211, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
The Effect of Language Diversity When Fine-Tuning Large Language Models for Translation (Stap & Monz, Findings 2025)
PDF:
https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.224.pdf
Checklist:
2025.findings-emnlp.224.checklist.pdf