Improved Norwegian Bokmål Translations for FLORES

Petter Mæhlum, Anders Næss Evensen, Yves Scherrer


Abstract
FLORES+ is a collection of parallel datasets obtained by translating original English source texts. It includes translations into the two official written variants of Norwegian: Norwegian Bokmål and Norwegian Nynorsk. However, the earliest Bokmål version contained non-native-like mistakes, and even after a later revision, the dataset still contained grammatical and lexical errors. This paper aims to correct unambiguous mistakes and thereby create a new version of the Bokmål dataset. At the same time, we provide a translation into Radical Bokmål, a sub-variety of Norwegian that is closer to Nynorsk in some respects while remaining within the official norms for Bokmål. We discuss the errors in the existing translations, the differences between the various translations, and the corrections that we provide.
Anthology ID:
2025.wmt-1.86
Volume:
Proceedings of the Tenth Conference on Machine Translation
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Barry Haddow, Tom Kocmi, Philipp Koehn, Christof Monz
Venue:
WMT
Publisher:
Association for Computational Linguistics
Pages:
1124–1132
URL:
https://preview.aclanthology.org/ingest-emnlp/2025.wmt-1.86/
Cite (ACL):
Petter Mæhlum, Anders Næss Evensen, and Yves Scherrer. 2025. Improved Norwegian Bokmål Translations for FLORES. In Proceedings of the Tenth Conference on Machine Translation, pages 1124–1132, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
Improved Norwegian Bokmål Translations for FLORES (Mæhlum et al., WMT 2025)
PDF:
https://preview.aclanthology.org/ingest-emnlp/2025.wmt-1.86.pdf