Modular Monolingual Adaptation using Pretrained Language Models

Nalin Kumar, Ondrej Dusek


Abstract
Building monolingual language models (LMs) for low-resource languages typically relies on adapting a pretrained language model (PLM) by finetuning the whole model on the target language. This approach is widely favored over training from scratch because it enables effective knowledge transfer. Prior work has also shown that using a language-specific tokenizer can improve adaptation. In this work, we hypothesize that full model tuning is often unnecessary and propose a more modular approach: we replace the tokens, freeze the corresponding embeddings, and tune the rest of the model. We experiment with Scottish Gaelic, Irish, and Quechua, the last being a very low-resource language (8.5k training instances). Evaluation on three natural language understanding (NLU) tasks (mask-filling, named entity recognition (NER), and part-of-speech (POS) tagging) shows that our proposed approach improves performance when adapting the models to low-resource languages. We also provide a comprehensive analysis of the effectiveness of the training strategies and of the choice of pretrained embeddings and models.
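The freeze-and-tune idea described in the abstract can be illustrated with a minimal PyTorch sketch. This is not the authors' implementation: the toy model `TinyLM`, the vocabulary sizes, the randomly generated `target_embeddings` (a placeholder for embeddings matching the new tokens), and the re-initialized output layer are all assumptions made for illustration only.

```python
import torch
from torch import nn

torch.manual_seed(0)


class TinyLM(nn.Module):
    """Hypothetical toy stand-in for a pretrained LM: embeddings, body, output head."""

    def __init__(self, vocab_size: int, dim: int) -> None:
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.body = nn.Sequential(nn.Linear(dim, dim), nn.Tanh())
        self.head = nn.Linear(dim, vocab_size)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        return self.head(self.body(self.embed(token_ids)))


old_vocab, new_vocab, dim = 50, 30, 8  # assumed sizes, chosen only for the demo
model = TinyLM(old_vocab, dim)

# Placeholder for embeddings of the target-language tokens (assumption:
# in practice these would come from some pretrained source, not randn).
target_embeddings = torch.randn(new_vocab, dim)

# Swap in the new embeddings and freeze them (requires_grad is set to False).
model.embed = nn.Embedding.from_pretrained(target_embeddings.clone(), freeze=True)
# Assumption of this sketch: the output layer is re-created for the new vocabulary.
model.head = nn.Linear(dim, new_vocab)

# Only the remaining (unfrozen) parameters are handed to the optimizer.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(trainable, lr=0.1)

body_before = model.body[0].weight.detach().clone()

# One training step on random token ids, standing in for target-language data.
inputs = torch.randint(0, new_vocab, (4, 6))
targets = torch.randint(0, new_vocab, (4, 6))
logits = model(inputs)
loss = nn.functional.cross_entropy(logits.view(-1, new_vocab), targets.view(-1))
loss.backward()
optimizer.step()

# The frozen embeddings stay identical; the rest of the model is updated.
embeddings_unchanged = torch.equal(model.embed.weight, target_embeddings)
body_changed = not torch.equal(model.body[0].weight, body_before)
```

Filtering on `requires_grad` before building the optimizer keeps the frozen embeddings out of the update entirely, which is the property the sketch is meant to show; the training loop, data, and model here are deliberately trivial.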
Anthology ID:
2026.acl-industry.125
Volume:
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026)
Month:
July
Year:
2026
Address:
San Diego, California, USA
Editors:
Yunyao Li, Georg Rehm, Mei Tu
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
1819–1828
URL:
https://preview.aclanthology.org/ingest-acl/2026.acl-industry.125/
Cite (ACL):
Nalin Kumar and Ondrej Dusek. 2026. Modular Monolingual Adaptation using Pretrained Language Models. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026), pages 1819–1828, San Diego, California, USA. Association for Computational Linguistics.
Cite (Informal):
Modular Monolingual Adaptation using Pretrained Language Models (Kumar & Dusek, ACL 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.acl-industry.125.pdf