Sigmorphon 2019 Task 2 system description paper: Morphological analysis in context for many languages, with supervision from only a few

Brad Aiken, Jared Kelly, Alexis Palmer, Suleyman Olcay Polat, Taraka Rama, Rodney Nielsen


Abstract
This paper presents the UNT HiLT+Ling system for the Sigmorphon 2019 shared Task 2: Morphological Analysis and Lemmatization in Context. Our core approach focuses on the morphological tagging task; part-of-speech tagging and lemmatization are treated as secondary tasks. Given the highly multilingual nature of the task, we propose an approach which makes minimal use of the supplied training data, in order to be extensible to languages without labeled training data for the morphological inflection task. Specifically, we use a parallel Bible corpus to align contextual embeddings at the verse level. The aligned verses are used to build cross-language translation matrices, which in turn are used to map between embedding spaces for the various languages. Finally, we use sets of inflected forms, primarily from a high-resource language, to induce vector representations for individual UniMorph tags. Morphological analysis is performed by matching vector representations to embeddings for individual tokens. While our system results are dramatically below the average system submitted for the shared task evaluation campaign, our method is (we suspect) unique in its minimal reliance on labeled training data.
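The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the translation matrices are fitted, how tag vectors are induced, or how matching is scored. The sketch assumes a least-squares fit over aligned verse embeddings, tag vectors formed by averaging the embeddings of forms that carry a tag, and cosine similarity with a fixed threshold for matching. All function names and the threshold value are hypothetical.

```python
import numpy as np


def fit_translation_matrix(src_verses, tgt_verses):
    """Fit W so that src_verses @ W approximates tgt_verses.

    Rows are verse-level embeddings; row i in both arrays is the same verse.
    Ordinary least squares is an assumption, not the paper's stated method.
    """
    W, *_ = np.linalg.lstsq(src_verses, tgt_verses, rcond=None)
    return W


def induce_tag_vectors(form_vecs, form_tags):
    """Average the embeddings of all inflected forms that carry each tag.

    Averaging is an assumed induction rule, used here for illustration only.
    """
    tags = sorted({t for ts in form_tags for t in ts})
    return {
        t: np.mean([v for v, ts in zip(form_vecs, form_tags) if t in ts], axis=0)
        for t in tags
    }


def cosine(a, b):
    """Cosine similarity with a small constant to avoid division by zero."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))


def predict_tags(token_vec, tag_vecs, threshold=0.5):
    """Return every tag whose vector is close enough to the token embedding.

    The threshold of 0.5 is a hypothetical value, not taken from the paper.
    """
    return sorted(t for t, v in tag_vecs.items() if cosine(token_vec, v) >= threshold)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dim = 8
    # Synthetic stand-ins for verse embeddings in two languages.
    true_map = rng.normal(size=(dim, dim))
    low_resource = rng.normal(size=(200, dim))
    high_resource = low_resource @ true_map

    W = fit_translation_matrix(low_resource, high_resource)

    # Tag vectors are induced in the high-resource embedding space.
    basis = np.eye(dim)
    tag_vecs = induce_tag_vectors(
        [basis[0], basis[0], basis[1]], [["PL"], ["PL"], ["PST"]]
    )

    # A low-resource token is mapped into the high-resource space, then matched.
    token = basis[0] @ np.linalg.inv(true_map)
    print(predict_tags(token @ W, tag_vecs))
```

In this toy setup the low-resource token is constructed so that it maps onto the vector induced for the `PL` tag. Real contextual embeddings would be far noisier, so the threshold and the induction rule would need tuning.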
Anthology ID:
W19-4211
Volume:
Proceedings of the 16th Workshop on Computational Research in Phonetics, Phonology, and Morphology
Month:
August
Year:
2019
Address:
Florence, Italy
Editors:
Garrett Nicolai, Ryan Cotterell
Venue:
ACL
SIG:
SIGMORPHON
Publisher:
Association for Computational Linguistics
Pages:
87–94
URL:
https://aclanthology.org/W19-4211
DOI:
10.18653/v1/W19-4211
Cite (ACL):
Brad Aiken, Jared Kelly, Alexis Palmer, Suleyman Olcay Polat, Taraka Rama, and Rodney Nielsen. 2019. Sigmorphon 2019 Task 2 system description paper: Morphological analysis in context for many languages, with supervision from only a few. In Proceedings of the 16th Workshop on Computational Research in Phonetics, Phonology, and Morphology, pages 87–94, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
Sigmorphon 2019 Task 2 system description paper: Morphological analysis in context for many languages, with supervision from only a few (Aiken et al., ACL 2019)
PDF:
https://preview.aclanthology.org/emnlp-22-attachments/W19-4211.pdf