Abstract
Neural part-of-speech tagging has achieved competitive results with the incorporation of character-based and pre-trained word embeddings. In this paper, we show that a state-of-the-art bi-LSTM tagger can benefit from using information from morphosyntactic lexicons as additional input. The tagger, trained on several dozen languages, shows a consistent, average improvement when using lexical information, even when also using character-based embeddings, thus showing the complementarity of the different sources of lexical information. The improvements are particularly important for the smaller datasets.
- Anthology ID:
- W17-6304
- Volume:
- Proceedings of the 15th International Conference on Parsing Technologies
- Month:
- September
- Year:
- 2017
- Address:
- Pisa, Italy
- Editors:
- Yusuke Miyao, Kenji Sagae
- Venue:
- IWPT
- SIG:
- SIGPARSE
- Publisher:
- Association for Computational Linguistics
- Pages:
- 25–31
- URL:
- https://aclanthology.org/W17-6304
- Cite (ACL):
- Benoît Sagot and Héctor Martínez Alonso. 2017. Improving neural tagging with lexical information. In Proceedings of the 15th International Conference on Parsing Technologies, pages 25–31, Pisa, Italy. Association for Computational Linguistics.
- Cite (Informal):
- Improving neural tagging with lexical information (Sagot & Martínez Alonso, IWPT 2017)
- PDF:
- https://preview.aclanthology.org/landing_page/W17-6304.pdf
- Data
- MULTEXT-East