Łukasz Gągała

Also published as: Łukasz Gagała

2025

Towards Multilingual LLM Evaluation for Baltic and Nordic languages: A study on Lithuanian History
Yevhen Kostiuk | Oxana Vitman | Łukasz Gągała | Artur Kiulian
Proceedings of the 1st Workshop on Nordic-Baltic Responsible Evaluation and Alignment of Language Models (NB-REAL 2025)

Evaluating LLM Judgment on Latvian and Lithuanian Short Answer Matching
Yevhen Kostiuk | Oxana Vitman | Łukasz Gągała | Artur Kiulian
Proceedings of the 1st Workshop on Nordic-Baltic Responsible Evaluation and Alignment of Language Models (NB-REAL 2025)

In this paper, we propose a model-agnostic cost-effective approach to developing bilingual base large language models (LLMs) to support English and any target language. The method includes vocabulary expansion, initialization of new embeddings, model training and evaluation. We performed our experiments with three languages, each using a non-Latin script: Ukrainian, Arabic, and Georgian. Our approach demonstrates improved language performance while reducing computational costs. It mitigates the disproportionate penalization of underrepresented languages, promoting fairness and minimizing adverse phenomena such as code-switching and broken grammar. Additionally, we introduce new metrics to evaluate language quality, revealing that vocabulary size significantly impacts the quality of generated text.

Co-authors

Dmytro Chaplynskyi 1

Guillermo Gabrielli 1
