Leveraging Gloss Knowledge in Neural Word Sense Disambiguation by Hierarchical Co-Attention
Fuli Luo, Tianyu Liu, Zexue He, Qiaolin Xia, Zhifang Sui, Baobao Chang
Abstract
The goal of Word Sense Disambiguation (WSD) is to identify the correct meaning of a word in a particular context. Traditional supervised methods use only labeled data (context) and miss rich lexical knowledge such as the gloss, which defines the meaning of a word sense. Recent studies have shown that incorporating glosses into neural networks for WSD yields significant improvements. However, previous models usually build the context representation and the gloss representation separately. In this paper, we find that learning the context representation and learning the gloss representation can benefit from each other. The gloss can help to highlight the important words in the context, thus building a better context representation. The context can also help to locate the key words in the gloss of the correct word sense. Therefore, we introduce a co-attention mechanism to generate co-dependent representations for the context and gloss. Furthermore, in order to capture both word-level and sentence-level information, we extend the attention mechanism in a hierarchical fashion. Experimental results show that our model achieves state-of-the-art results on several standard English all-words WSD test datasets.
- Anthology ID:
- D18-1170
- Volume:
- Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
- Month:
- October-November
- Year:
- 2018
- Address:
- Brussels, Belgium
- Editors:
- Ellen Riloff, David Chiang, Julia Hockenmaier, Jun’ichi Tsujii
- Venue:
- EMNLP
- SIG:
- SIGDAT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 1402–1411
- URL:
- https://aclanthology.org/D18-1170
- DOI:
- 10.18653/v1/D18-1170
- Cite (ACL):
- Fuli Luo, Tianyu Liu, Zexue He, Qiaolin Xia, Zhifang Sui, and Baobao Chang. 2018. Leveraging Gloss Knowledge in Neural Word Sense Disambiguation by Hierarchical Co-Attention. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 1402–1411, Brussels, Belgium. Association for Computational Linguistics.
- Cite (Informal):
- Leveraging Gloss Knowledge in Neural Word Sense Disambiguation by Hierarchical Co-Attention (Luo et al., EMNLP 2018)
- PDF:
- https://preview.aclanthology.org/ml4al-ingestion/D18-1170.pdf
- Data
- Word Sense Disambiguation: a Unified Evaluation Framework and Empirical Comparison
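
The co-attention idea in the abstract, where the gloss highlights important context words and the context highlights key gloss words, can be illustrated with a minimal word-level sketch. This is not the authors' model: the dot-product similarity, the max-pooling over the similarity matrix, and all names (`co_attention`, `weighted_sum`, the toy vectors) are assumptions chosen for illustration, and the hierarchical sentence-level stage is omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def weighted_sum(weights, vectors):
    """Combine vectors into one vector using the given weights."""
    dim = len(vectors[0])
    return [sum(w * vec[i] for w, vec in zip(weights, vectors)) for i in range(dim)]


def co_attention(context, gloss):
    """Hypothetical word-level co-attention (illustrative assumption).

    context, gloss: lists of word vectors with the same dimension.
    Returns (context_repr, gloss_repr, context_weights, gloss_weights).
    """
    # sim[i][j]: similarity between context word i and gloss word j.
    sim = [[dot(c, g) for g in gloss] for c in context]
    # Gloss-to-context attention: score each context word by its
    # best-matching gloss word, then normalize.
    context_weights = softmax([max(row) for row in sim])
    # Context-to-gloss attention: score each gloss word by its
    # best-matching context word, then normalize.
    gloss_weights = softmax(
        [max(sim[i][j] for i in range(len(context))) for j in range(len(gloss))]
    )
    context_repr = weighted_sum(context_weights, context)
    gloss_repr = weighted_sum(gloss_weights, gloss)
    return context_repr, gloss_repr, context_weights, gloss_weights


# Toy 2-d "embeddings" (made up for the example).
toy_context = [[1.0, 0.0], [0.0, 1.0], [0.1, 0.1]]
toy_gloss = [[2.0, 0.0], [0.5, 0.0]]
ctx_repr, gls_repr, ctx_w, gls_w = co_attention(toy_context, toy_gloss)
```

In this toy input, the first context word aligns most strongly with the gloss, so it receives the largest context weight. A full system would compute such representations for each candidate sense's gloss and score the senses against the context.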