Abstract
This paper presents the submission of the UH&CU team (Joint University of Colorado and University of Helsinki team) for the VarDial 2018 shared task on morphosyntactic tagging of Croatian, Slovenian and Serbian tweets. Our system is a bidirectional LSTM tagger which emits tags as character sequences using an LSTM generator in order to be able to handle unknown tags and combinations of several tags for one token which occur in the shared task data sets. To the best of our knowledge, using an LSTM generator is a novel approach. The system delivers sizable improvements of more than 6%-points over a baseline trigram tagger. Overall, the performance of our system is quite even for all three languages.- Anthology ID:
- W18-3904
- Volume:
- Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial 2018)
- Month:
- August
- Year:
- 2018
- Address:
- Santa Fe, New Mexico, USA
- Venue:
- VarDial
- SIG:
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 37–45
- Language:
- URL:
- https://aclanthology.org/W18-3904
- DOI:
- Cite (ACL):
- Miikka Silfverberg and Senka Drobac. 2018. Sub-label dependencies for Neural Morphological Tagging – The Joint Submission of University of Colorado and University of Helsinki for VarDial 2018. In Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial 2018), pages 37–45, Santa Fe, New Mexico, USA. Association for Computational Linguistics.
- Cite (Informal):
- Sub-label dependencies for Neural Morphological Tagging – The Joint Submission of University of Colorado and University of Helsinki for VarDial 2018 (Silfverberg & Drobac, VarDial 2018)
- PDF:
- https://preview.aclanthology.org/ingestion-script-update/W18-3904.pdf
- Data
- MULTEXT-East