THOMAS: The Hegemonic OSU Morphological Analyzer using Seq2seq

Byung-Doh Oh, Pranav Maneriker, Nanjiang Jiang


Abstract
This paper describes the OSU submission to the SIGMORPHON 2019 shared task, Crosslinguality and Context in Morphology. Our system addresses the contextual morphological analysis subtask of Task 2, which is to produce the morphosyntactic description (MSD) of each fully inflected word within a given sentence. We frame this as a sequence generation task and employ a neural encoder-decoder (seq2seq) architecture to generate the sequence of MSD tags given the encoded representation of each token. Follow-up analyses reveal that our system most significantly improves performance on morphologically complex languages whose inflected word forms typically have longer MSD tag sequences. In addition, our system seems to capture the structured correlation between MSD tags, such as that between the “verb” tag and TAM-related tags.
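To make the sequence-generation framing concrete, the sketch below shows one way an MSD could be turned into a decoder target and recovered from generated output. It is an illustration only, not the authors' implementation: it assumes MSDs are written as semicolon-delimited tag strings (as in UniMorph-style annotation), and the helper names `msd_to_target` and `target_to_msd`, the boundary symbols, and the example tags are all hypothetical.

```python
# Illustrative sketch only: assumes semicolon-delimited MSD strings.
# BOS/EOS symbols and helper names are hypothetical, not from the paper.
BOS, EOS = "<s>", "</s>"


def msd_to_target(msd: str) -> list[str]:
    """Split an MSD string into a tag sequence wrapped in boundary symbols."""
    return [BOS] + [tag for tag in msd.split(";") if tag] + [EOS]


def target_to_msd(tokens: list[str]) -> str:
    """Join generated tags back into an MSD string, stopping at the first EOS."""
    tags = []
    for tok in tokens:
        if tok == EOS:
            break
        if tok != BOS:
            tags.append(tok)
    return ";".join(tags)


# A word with a longer MSD simply yields a longer target sequence.
example = msd_to_target("V;PST;3;SG")
restored = target_to_msd(example)
```

Under this framing, a decoder emits one tag per step, so each tag can be conditioned on the tags already generated for the same token.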
Anthology ID:
W19-4210
Volume:
Proceedings of the 16th Workshop on Computational Research in Phonetics, Phonology, and Morphology
Month:
August
Year:
2019
Address:
Florence, Italy
Editors:
Garrett Nicolai, Ryan Cotterell
Venue:
ACL
SIG:
SIGMORPHON
Publisher:
Association for Computational Linguistics
Pages:
80–86
URL:
https://aclanthology.org/W19-4210
DOI:
10.18653/v1/W19-4210
Cite (ACL):
Byung-Doh Oh, Pranav Maneriker, and Nanjiang Jiang. 2019. THOMAS: The Hegemonic OSU Morphological Analyzer using Seq2seq. In Proceedings of the 16th Workshop on Computational Research in Phonetics, Phonology, and Morphology, pages 80–86, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
THOMAS: The Hegemonic OSU Morphological Analyzer using Seq2seq (Oh et al., ACL 2019)
PDF:
https://preview.aclanthology.org/emnlp-22-attachments/W19-4210.pdf