Analysis of Transfer Learning for Named Entity Recognition in South-Slavic Languages
Nikola Ivačič, Thi Hong Hanh Tran, Boshko Koloski, Senja Pollak, Matthew Purver
Abstract
This paper analyzes the Named Entity Recognition task for South-Slavic languages using pre-trained multilingual neural network models. We investigate whether model performance on a target language can be improved by using data from closely related languages. We show that model performance is not substantially affected when the model is trained on a language other than the target language. For Slovene, the monolingual setting generally performs better; for Croatian and Serbian, the results are slightly better in selected cross-lingual settings, but the improvements are not large. The most significant performance improvement is observed for Serbian, which has the smallest corpora. Therefore, fine-tuning with closely related languages may benefit only “low-resource” languages.
- Anthology ID:
- 2023.bsnlp-1.13
- Volume:
- Proceedings of the 9th Workshop on Slavic Natural Language Processing 2023 (SlavicNLP 2023)
- Month:
- May
- Year:
- 2023
- Address:
- Dubrovnik, Croatia
- Venue:
- BSNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 106–112
- URL:
- https://aclanthology.org/2023.bsnlp-1.13
- Cite (ACL):
- Nikola Ivačič, Thi Hong Hanh Tran, Boshko Koloski, Senja Pollak, and Matthew Purver. 2023. Analysis of Transfer Learning for Named Entity Recognition in South-Slavic Languages. In Proceedings of the 9th Workshop on Slavic Natural Language Processing 2023 (SlavicNLP 2023), pages 106–112, Dubrovnik, Croatia. Association for Computational Linguistics.
- Cite (Informal):
- Analysis of Transfer Learning for Named Entity Recognition in South-Slavic Languages (Ivačič et al., BSNLP 2023)
- PDF:
- https://preview.aclanthology.org/starsem-semeval-split/2023.bsnlp-1.13.pdf