Finnur Ágúst Ingimundarson
2025
An Icelandic Linguistic Benchmark for Large Language Models
Bjarki Ármannsson | Finnur Ágúst Ingimundarson | Einar Freyr Sigurðsson
Proceedings of the Joint 25th Nordic Conference on Computational Linguistics and 11th Baltic Conference on Human Language Technologies (NoDaLiDa/Baltic-HLT 2025)
This paper introduces a linguistic benchmark for Icelandic-language LLMs, the first of its kind manually constructed by native speakers. We report on the scores obtained by current state-of-the-art models, which indicate room for improvement, and discuss the theoretical problems involved in creating such a benchmark and scoring a model’s performance.