A Cross-Lingual Perspective on Neural Machine Translation Difficulty

Esther Ploeger, Johannes Bjerva, Jörg Tiedemann, Robert Oestling


Abstract
Intuitively, machine translation (MT) between closely related languages, such as Swedish and Danish, is easier than MT between more distant pairs, such as Finnish and Danish. Yet, the notions of ‘closely related’ languages and ‘easier’ translation have so far remained underspecified. Moreover, in the context of neural MT, this assumption has almost exclusively been evaluated in scenarios where English was either the source or target language, leaving a broader cross-lingual view unexplored. In this work, we present a controlled study of language similarity and neural MT difficulty for 56 European translation directions. We test a range of language similarity metrics, some of which are reasonable predictors of MT difficulty. At the text level, we reassess previously introduced indicators of MT difficulty, and find that they are not well suited to our domain, or to neural MT more generally. Ultimately, we hope that this work inspires further cross-lingual investigations of neural MT difficulty.
Anthology ID:
2025.wmt-1.21
Volume:
Proceedings of the Tenth Conference on Machine Translation
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Barry Haddow, Tom Kocmi, Philipp Koehn, Christof Monz
Venue:
WMT
Publisher:
Association for Computational Linguistics
Pages:
340–354
URL:
https://preview.aclanthology.org/ingest-emnlp/2025.wmt-1.21/
Cite (ACL):
Esther Ploeger, Johannes Bjerva, Jörg Tiedemann, and Robert Oestling. 2025. A Cross-Lingual Perspective on Neural Machine Translation Difficulty. In Proceedings of the Tenth Conference on Machine Translation, pages 340–354, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
A Cross-Lingual Perspective on Neural Machine Translation Difficulty (Ploeger et al., WMT 2025)
PDF:
https://preview.aclanthology.org/ingest-emnlp/2025.wmt-1.21.pdf