Towards cross-lingual application of language-specific PoS tagging schemes

Hinrik Hafsteinsson, Anton Karl Ingason


Abstract
We describe the process of converting between the PoS tagging schemes of two languages: the Icelandic MIM-GOLD tagging scheme and the Faroese Sosialurin tagging scheme. The two schemes are functionally similar but encode fine-grained morphological information on tokenised text in different ways. Because Faroese and Icelandic are lexically and grammatically similar, a systematic method for converting between the two schemes would benefit language technology, particularly research on transfer learning between the two languages. As a product of our work, we present a provisional version of Icelandic corpora, prepared in the Faroese PoS tagging scheme, ready for use in cross-lingual NLP applications.
Anthology ID:
2021.nodalida-main.33
Volume:
Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa)
Month:
31 May–2 June
Year:
2021
Address:
Reykjavik, Iceland (Online)
Editors:
Simon Dobnik, Lilja Øvrelid
Venue:
NoDaLiDa
Publisher:
Linköping University Electronic Press, Sweden
Pages:
321–325
URL:
https://aclanthology.org/2021.nodalida-main.33
Cite (ACL):
Hinrik Hafsteinsson and Anton Karl Ingason. 2021. Towards cross-lingual application of language-specific PoS tagging schemes. In Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa), pages 321–325, Reykjavik, Iceland (Online). Linköping University Electronic Press, Sweden.
Cite (Informal):
Towards cross-lingual application of language-specific PoS tagging schemes (Hafsteinsson & Ingason, NoDaLiDa 2021)
PDF:
https://preview.aclanthology.org/nschneid-patch-2/2021.nodalida-main.33.pdf