Improving neural tagging with lexical information

Benoît Sagot, Héctor Martínez Alonso


Abstract
Neural part-of-speech tagging has achieved competitive results with the incorporation of character-based and pre-trained word embeddings. In this paper, we show that a state-of-the-art bi-LSTM tagger can benefit from using information from morphosyntactic lexicons as additional input. The tagger, trained on several dozen languages, shows a consistent, average improvement when using lexical information, even when also using character-based embeddings, thus showing the complementarity of the different sources of lexical information. The improvements are particularly important for the smaller datasets.
Anthology ID:
W17-6304
Volume:
Proceedings of the 15th International Conference on Parsing Technologies
Month:
September
Year:
2017
Address:
Pisa, Italy
Editors:
Yusuke Miyao, Kenji Sagae
Venue:
IWPT
SIG:
SIGPARSE
Publisher:
Association for Computational Linguistics
Pages:
25–31
URL:
https://aclanthology.org/W17-6304
Cite (ACL):
Benoît Sagot and Héctor Martínez Alonso. 2017. Improving neural tagging with lexical information. In Proceedings of the 15th International Conference on Parsing Technologies, pages 25–31, Pisa, Italy. Association for Computational Linguistics.
Cite (Informal):
Improving neural tagging with lexical information (Sagot & Martínez Alonso, IWPT 2017)
PDF:
https://preview.aclanthology.org/landing_page/W17-6304.pdf
Data
MULTEXT-East