Multi-label Scandinavian Language Identification (SLIDE)

Mariia Fedorova, Jonas Sebulon Frydenberg, Victoria Handford, Victoria Ovedie Chruickshank Langø, Solveig Helene Willoch, Marthe Løken Midtgaard, Yves Scherrer, Petter Mæhlum, David Samuel


Abstract
Identifying closely related languages at sentence level is difficult, in particular because it is often impossible to assign a sentence to a single language. In this paper, we focus on multi-label sentence-level Scandinavian language identification (LID) for Danish, Norwegian Bokmål, Norwegian Nynorsk, and Swedish. We present the Scandinavian Language Identification and Evaluation, SLIDE, a manually curated multi-label evaluation dataset and a suite of LID models with varying speed–accuracy tradeoffs. We demonstrate that the ability to identify multiple languages simultaneously is necessary for any accurate LID method, and present a novel approach to training such multi-label LID models.
Anthology ID:
2025.resourceful-1.33
Volume:
Proceedings of the Third Workshop on Resources and Representations for Under-Resourced Languages and Domains (RESOURCEFUL-2025)
Month:
March
Year:
2025
Address:
Tallinn, Estonia
Editors:
Špela Arhar Holdt, Nikolai Ilinykh, Barbara Scalvini, Micaella Bruton, Iben Nyholm Debess, Crina Madalina Tudor
Venues:
RESOURCEFUL | WS
SIG:
Publisher:
University of Tartu Library, Estonia
Note:
Pages:
179–189
Language:
URL:
https://preview.aclanthology.org/fix-sig-urls/2025.resourceful-1.33/
DOI:
Bibkey:
Cite (ACL):
Mariia Fedorova, Jonas Sebulon Frydenberg, Victoria Handford, Victoria Ovedie Chruickshank Langø, Solveig Helene Willoch, Marthe Løken Midtgaard, Yves Scherrer, Petter Mæhlum, and David Samuel. 2025. Multi-label Scandinavian Language Identification (SLIDE). In Proceedings of the Third Workshop on Resources and Representations for Under-Resourced Languages and Domains (RESOURCEFUL-2025), pages 179–189, Tallinn, Estonia. University of Tartu Library, Estonia.
Cite (Informal):
Multi-label Scandinavian Language Identification (SLIDE) (Fedorova et al., RESOURCEFUL 2025)
Copy Citation:
PDF:
https://preview.aclanthology.org/fix-sig-urls/2025.resourceful-1.33.pdf