An Icelandic Linguistic Benchmark for Large Language Models

Bjarki Ármannsson, Finnur Ágúst Ingimundarson, Einar Freyr Sigurðsson


Abstract
This paper introduces a linguistic benchmark for Icelandic-language LLMs, the first of its kind manually constructed by native speakers. We report on the scores obtained by current state-of-the-art models, which indicate room for improvement, and discuss the theoretical problems involved in creating such a benchmark and scoring a model’s performance.
Anthology ID:
2025.nodalida-1.5
Volume:
Proceedings of the Joint 25th Nordic Conference on Computational Linguistics and 11th Baltic Conference on Human Language Technologies (NoDaLiDa/Baltic-HLT 2025)
Month:
March
Year:
2025
Address:
Tallinn, Estonia
Editors:
Richard Johansson, Sara Stymne
Venue:
NoDaLiDa
Publisher:
University of Tartu Library
Pages:
37–47
URL:
https://preview.aclanthology.org/corrections-2025-06/2025.nodalida-1.5/
Cite (ACL):
Bjarki Ármannsson, Finnur Ágúst Ingimundarson, and Einar Freyr Sigurðsson. 2025. An Icelandic Linguistic Benchmark for Large Language Models. In Proceedings of the Joint 25th Nordic Conference on Computational Linguistics and 11th Baltic Conference on Human Language Technologies (NoDaLiDa/Baltic-HLT 2025), pages 37–47, Tallinn, Estonia. University of Tartu Library.
Cite (Informal):
An Icelandic Linguistic Benchmark for Large Language Models (Ármannsson et al., NoDaLiDa 2025)
PDF:
https://preview.aclanthology.org/corrections-2025-06/2025.nodalida-1.5.pdf