Playing by the Rules: A Benchmark Set for Standardized Icelandic Orthography
Bjarki Ármannsson, Hinrik Hafsteinsson, Jóhannes B. Sigtryggsson, Atli Jasonarson, Einar Freyr Sigurðsson, Steinþór Steingrímsson
Abstract
We present the Icelandic Standardization Benchmark Set: Spelling and Punctuation (IceStaBS:SP), a dataset designed to provide standardized text examples for Icelandic orthography. The dataset includes non-standard orthography examples and their standardized counterparts, along with detailed explanations based on official Icelandic spelling rules. IceStaBS:SP aims to support the development and evaluation of automatic spell and grammar checkers, particularly in educational settings. We evaluate various spell and grammar checkers using IceStaBS:SP, demonstrating its utility as a benchmarking tool and highlighting areas for future improvement.
- Anthology ID:
- 2025.nodalida-1.4
- Volume:
- Proceedings of the Joint 25th Nordic Conference on Computational Linguistics and 11th Baltic Conference on Human Language Technologies (NoDaLiDa/Baltic-HLT 2025)
- Month:
- March
- Year:
- 2025
- Address:
- Tallinn, Estonia
- Editors:
- Richard Johansson, Sara Stymne
- Venue:
- NoDaLiDa
- Publisher:
- University of Tartu Library
- Pages:
- 28–36
- URL:
- https://preview.aclanthology.org/fix-sig-urls/2025.nodalida-1.4/
- Cite (ACL):
- Bjarki Ármannsson, Hinrik Hafsteinsson, Jóhannes B. Sigtryggsson, Atli Jasonarson, Einar Freyr Sigurðsson, and Steinþór Steingrímsson. 2025. Playing by the Rules: A Benchmark Set for Standardized Icelandic Orthography. In Proceedings of the Joint 25th Nordic Conference on Computational Linguistics and 11th Baltic Conference on Human Language Technologies (NoDaLiDa/Baltic-HLT 2025), pages 28–36, Tallinn, Estonia. University of Tartu Library.
- Cite (Informal):
- Playing by the Rules: A Benchmark Set for Standardized Icelandic Orthography (Ármannsson et al., NoDaLiDa 2025)
- PDF:
- https://preview.aclanthology.org/fix-sig-urls/2025.nodalida-1.4.pdf