Finetuning LLMs for EvaCun 2025 token prediction shared task

Josef Jon, Ondřej Bojar
Abstract
In this paper, we present our submission for the token prediction task of EvaCun 2025. Our systems are based on LLMs (Command-R, Mistral, and Aya Expanse) fine-tuned on the task data provided by the organizers. As we possess only very superficial knowledge of the subject field and the languages of the task, we simply used the training data without any task-specific adjustments, preprocessing, or filtering. We compare three different approaches (based on three different prompts) to obtaining the predictions, and we evaluate them on a held-out part of the data.
Anthology ID:
2025.alp-1.29
Volume:
Proceedings of the Second Workshop on Ancient Language Processing
Month:
May
Year:
2025
Address:
The Albuquerque Convention Center, Laguna
Editors:
Adam Anderson, Shai Gordin, Bin Li, Yudong Liu, Marco C. Passarotti, Rachele Sprugnoli
Venues:
ALP | WS
Publisher:
Association for Computational Linguistics
Pages:
221–225
URL:
https://preview.aclanthology.org/fix-sig-urls/2025.alp-1.29/
Cite (ACL):
Josef Jon and Ondřej Bojar. 2025. Finetuning LLMs for EvaCun 2025 token prediction shared task. In Proceedings of the Second Workshop on Ancient Language Processing, pages 221–225, The Albuquerque Convention Center, Laguna. Association for Computational Linguistics.
Cite (Informal):
Finetuning LLMs for EvaCun 2025 token prediction shared task (Jon & Bojar, ALP 2025)
PDF:
https://preview.aclanthology.org/fix-sig-urls/2025.alp-1.29.pdf