Fine-tuning and evaluation of NMT models for literary texts using RomCro v.2.0

Bojana Mikelenić, Antoni Oliver, Sergi Àlvarez Vidal


Abstract
This paper explores the fine-tuning and evaluation of neural machine translation (NMT) models for literary texts using RomCro v.2.0, an expanded multilingual and multidirectional parallel corpus. RomCro v.2.0 is based on RomCro v.1.0, but includes additional literary works, as well as texts in Catalan, making it a valuable resource for improving MT in underrepresented language pairs. Given the challenges of literary translation, where style, narrative voice, and cultural nuances must be preserved, fine-tuning on high-quality domain-specific data is essential for enhancing MT performance. We fine-tune existing NMT models with RomCro v.2.0 and evaluate their performance on six language combinations using automatic metrics, and on Spanish-Croatian and French-Catalan using manual evaluation. Results indicate that fine-tuned models outperform general-purpose systems, achieving greater fluency and stylistic coherence. These findings support the effectiveness of corpus-driven fine-tuning for literary translation and highlight the importance of a curated, high-quality corpus.
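The automatic evaluation step mentioned in the abstract can be sketched with standard tooling. The snippet below is illustrative rather than taken from the paper: it assumes the widely used sacrebleu package, plain-text hypothesis and reference files (names are hypothetical), and BLEU and chrF as representative automatic metrics; the authors' exact metric set and evaluation setup are not specified in this abstract.

import sacrebleu

# Hypothetical file names: one sentence per line, hypotheses aligned with references.
with open("hypotheses.txt", encoding="utf-8") as f:
    hypotheses = [line.strip() for line in f]
with open("references.txt", encoding="utf-8") as f:
    references = [line.strip() for line in f]

# Corpus-level BLEU and chrF, two metrics commonly reported for literary MT.
bleu = sacrebleu.corpus_bleu(hypotheses, [references])
chrf = sacrebleu.corpus_chrf(hypotheses, [references])

print(f"BLEU: {bleu.score:.2f}")
print(f"chrF: {chrf.score:.2f}")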
Anthology ID:
2025.ctt-1.4
Volume:
Proceedings of the Second Workshop on Creative-text Translation and Technology (CTT)
Month:
June
Year:
2025
Address:
Geneva, Switzerland
Editors:
Bram Vanroy, Marie-Aude Lefer, Lieve Macken, Paola Ruffo, Ana Guerberof Arenas, Damien Hansen
Venue:
CTT
Publisher:
European Association for Machine Translation
Pages:
44–51
URL:
https://preview.aclanthology.org/mtsummit-25-ingestion/2025.ctt-1.4/
Cite (ACL):
Bojana Mikelenić, Antoni Oliver, and Sergi Àlvarez Vidal. 2025. Fine-tuning and evaluation of NMT models for literary texts using RomCro v.2.0. In Proceedings of the Second Workshop on Creative-text Translation and Technology (CTT), pages 44–51, Geneva, Switzerland. European Association for Machine Translation.
Cite (Informal):
Fine-tuning and evaluation of NMT models for literary texts using RomCro v.2.0 (Mikelenić et al., CTT 2025)
PDF:
https://preview.aclanthology.org/mtsummit-25-ingestion/2025.ctt-1.4.pdf