Abstract
We present our submission to the unconstrained subtask of the SIGTYP 2024 Shared Task on Word Embedding Evaluation for Ancient and Historical Languages, covering morphological annotation, POS tagging, lemmatization, and character- and word-level gap-filling. We developed a simple, uniform, and computationally lightweight approach based on the adapters framework, using parameter-efficient fine-tuning. We applied the same adapter-based approach uniformly to all tasks and 16 languages by fine-tuning stacked language- and task-specific adapters. Our submission obtained overall second place out of three submissions, with first place in word-level gap-filling. Our results show the feasibility of adapting language models pre-trained on modern languages to historical and ancient languages via adapter training.
- Anthology ID:
- 2024.sigtyp-1.15
- Volume:
- Proceedings of the 6th Workshop on Research in Computational Linguistic Typology and Multilingual NLP
- Month:
- March
- Year:
- 2024
- Address:
- St. Julian's, Malta
- Editors:
- Michael Hahn, Alexey Sorokin, Ritesh Kumar, Andreas Shcherbakov, Yulia Otmakhova, Jinrui Yang, Oleg Serikov, Priya Rani, Edoardo M. Ponti, Saliha Muradoğlu, Rena Gao, Ryan Cotterell, Ekaterina Vylomova
- Venues:
- SIGTYP | WS
- Publisher:
- Association for Computational Linguistics
- Pages:
- 120–130
- URL:
- https://aclanthology.org/2024.sigtyp-1.15
- Cite (ACL):
- Aleksei Dorkin and Kairit Sirts. 2024. TartuNLP @ SIGTYP 2024 Shared Task: Adapting XLM-RoBERTa for Ancient and Historical Languages. In Proceedings of the 6th Workshop on Research in Computational Linguistic Typology and Multilingual NLP, pages 120–130, St. Julian's, Malta. Association for Computational Linguistics.
- Cite (Informal):
- TartuNLP @ SIGTYP 2024 Shared Task: Adapting XLM-RoBERTa for Ancient and Historical Languages (Dorkin & Sirts, SIGTYP-WS 2024)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-2/2024.sigtyp-1.15.pdf