Abstract
We describe the process of conversion between the PoS tagging schemes of two languages, the Icelandic MIM-GOLD tagging scheme and the Faroese Sosialurin tagging scheme. These tagging schemes are functionally similar but use separate ways to encode fine-grained morphological information on tokenised text. As Faroese and Icelandic are lexically and grammatically similar, having a systematic method to convert between these two tagging schemes would be beneficial in the field of language technology, specifically in research on transfer learning between the two languages. As a product of our work, we present a provisional version of Icelandic corpora, prepared in the Faroese PoS tagging scheme, ready for use in cross-lingual NLP applications.
- Anthology ID:
- 2021.nodalida-main.33
- Volume:
- Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa)
- Month:
- 31 May–2 June
- Year:
- 2021
- Address:
- Reykjavik, Iceland (Online)
- Editors:
- Simon Dobnik, Lilja Øvrelid
- Venue:
- NoDaLiDa
- Publisher:
- Linköping University Electronic Press, Sweden
- Pages:
- 321–325
- URL:
- https://aclanthology.org/2021.nodalida-main.33
- Cite (ACL):
- Hinrik Hafsteinsson and Anton Karl Ingason. 2021. Towards cross-lingual application of language-specific PoS tagging schemes. In Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa), pages 321–325, Reykjavik, Iceland (Online). Linköping University Electronic Press, Sweden.
- Cite (Informal):
- Towards cross-lingual application of language-specific PoS tagging schemes (Hafsteinsson & Ingason, NoDaLiDa 2021)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-2/2021.nodalida-main.33.pdf