Multi-Team: A Multi-attention, Multi-decoder Approach to Morphological Analysis.

Ahmet Üstün, Rob van der Goot, Gosse Bouma, Gertjan van Noord


Abstract
This paper describes our submission to SIGMORPHON 2019 Task 2: Morphological analysis and lemmatization in context. Our model is a multi-task sequence-to-sequence neural network that jointly learns morphological tagging and lemmatization. On the encoding side, we exploit character-level as well as contextual information. We introduce a multi-attention decoder to selectively focus on different parts of character and word sequences. To further improve the model, we train on multiple datasets simultaneously and use external embeddings for initialization. Our final model reaches an average morphological tagging F1 score of 94.54 and a lemma accuracy of 93.91 on the test data, ranking 3rd and 6th, respectively, out of 13 teams in the SIGMORPHON 2019 shared task.
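The idea of a decoder that attends separately to character-level and word-level encodings can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the scoring function or how the attention outputs are combined, so dot-product scoring and simple concatenation are assumptions here, and the names `attend` and `multi_attention` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, memory):
    """Dot-product attention (an assumed scoring function).

    Returns the weighted average of the memory vectors and the weights.
    """
    scores = [sum(q * k for q, k in zip(query, vec)) for vec in memory]
    weights = softmax(scores)
    dim = len(memory[0])
    context = [
        sum(w * vec[i] for w, vec in zip(weights, memory)) for i in range(dim)
    ]
    return context, weights


def multi_attention(decoder_state, char_states, word_states):
    """Attend to character and word encodings separately.

    The two context vectors are concatenated (an assumption) so that a
    decoder step could condition on both sources of information.
    """
    char_ctx, char_weights = attend(decoder_state, char_states)
    word_ctx, word_weights = attend(decoder_state, word_states)
    return char_ctx + word_ctx, char_weights, word_weights


if __name__ == "__main__":
    state = [1.0, 0.0]                        # toy decoder hidden state
    chars = [[1.0, 0.0], [0.0, 1.0]]          # toy character encodings
    words = [[0.5, 0.5], [2.0, 0.0], [0.0, 2.0]]  # toy word encodings
    context, cw, ww = multi_attention(state, chars, words)
    print(len(context), cw, ww)
```

In a multi-task setting such as the one the abstract describes, separate decoders for lemmatization and tagging could each call a function like this with their own state, which lets each task weight the character and word sequences differently.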
Anthology ID:
W19-4206
Volume:
Proceedings of the 16th Workshop on Computational Research in Phonetics, Phonology, and Morphology
Month:
August
Year:
2019
Address:
Florence, Italy
Venue:
ACL
SIG:
SIGMORPHON
Publisher:
Association for Computational Linguistics
Pages:
35–49
URL:
https://aclanthology.org/W19-4206
DOI:
10.18653/v1/W19-4206
Cite (ACL):
Ahmet Üstün, Rob van der Goot, Gosse Bouma, and Gertjan van Noord. 2019. Multi-Team: A Multi-attention, Multi-decoder Approach to Morphological Analysis. In Proceedings of the 16th Workshop on Computational Research in Phonetics, Phonology, and Morphology, pages 35–49, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
Multi-Team: A Multi-attention, Multi-decoder Approach to Morphological Analysis. (Üstün et al., ACL 2019)
PDF:
https://preview.aclanthology.org/ingestion-script-update/W19-4206.pdf