Abstract
Word Sense Disambiguation (WSD) is a long-standing but open problem in Natural Language Processing (NLP). WSD corpora are typically small in size, owing to an expensive annotation process. Current supervised WSD methods treat senses as discrete labels and also resort to predicting the Most-Frequent-Sense (MFS) for words unseen during training. This leads to poor performance on rare and unseen senses. To overcome this challenge, we propose Extended WSD Incorporating Sense Embeddings (EWISE), a supervised model to perform WSD by predicting over a continuous sense embedding space as opposed to a discrete label space. This allows EWISE to generalize over both seen and unseen senses, thus achieving generalized zero-shot learning. To obtain target sense embeddings, EWISE utilizes sense definitions. EWISE learns a novel sentence encoder for sense definitions by using WordNet relations and also ConvE, a recently proposed knowledge graph embedding method. We also compare EWISE against other sentence encoders pretrained on large corpora to generate definition embeddings. EWISE achieves new state-of-the-art WSD performance.
- Anthology ID:
- P19-1568
- Volume:
- Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
- Month:
- July
- Year:
- 2019
- Address:
- Florence, Italy
- Editors:
- Anna Korhonen, David Traum, Lluís Màrquez
- Venue:
- ACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 5670–5681
- URL:
- https://aclanthology.org/P19-1568
- DOI:
- 10.18653/v1/P19-1568
- Cite (ACL):
- Sawan Kumar, Sharmistha Jat, Karan Saxena, and Partha Talukdar. 2019. Zero-shot Word Sense Disambiguation using Sense Definition Embeddings. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 5670–5681, Florence, Italy. Association for Computational Linguistics.
- Cite (Informal):
- Zero-shot Word Sense Disambiguation using Sense Definition Embeddings (Kumar et al., ACL 2019)
- PDF:
- https://preview.aclanthology.org/naacl24-info/P19-1568.pdf
- Code:
- malllabiisc/EWISE
- Data:
- Word Sense Disambiguation: a Unified Evaluation Framework and Empirical Comparison
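
The abstract's central idea, predicting over a continuous sense embedding space rather than a discrete label space, can be illustrated with a minimal sketch. This is not the EWISE implementation: the sense names, the three-dimensional vectors, and the functions `dot` and `disambiguate` are all hypothetical, and a plain dot product is assumed as the scoring function. In the paper's setting the sense vectors would come from a definition encoder and the context vector from a trained context encoder.

```python
from typing import Dict, List, Tuple

Vector = List[float]


def dot(u: Vector, v: Vector) -> float:
    """Plain dot product, used here as an assumed scoring function."""
    return sum(a * b for a, b in zip(u, v))


def disambiguate(
    context_vec: Vector, candidates: Dict[str, Vector]
) -> Tuple[str, Dict[str, float]]:
    """Score a context vector against every candidate sense embedding.

    Returns the highest-scoring sense and the full score table. Because
    scoring only needs an embedding per sense, a sense with no training
    examples can still be selected as long as it has an embedding.
    """
    scores = {sense: dot(context_vec, emb) for sense, emb in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical toy embeddings for three senses of "bank".
# "bank_tilt" stands in for a sense never observed during training.
candidates = {
    "bank_finance": [1.0, 0.0, 0.0],
    "bank_river": [0.0, 1.0, 0.0],
    "bank_tilt": [0.0, 0.0, 1.0],
}

# Hypothetical context vector for "the plane began to bank left".
context_vec = [0.1, 0.2, 0.9]

best, scores = disambiguate(context_vec, candidates)
print(best)  # bank_tilt
```

A discrete classifier has one output unit per sense seen in training, so it cannot select `bank_tilt` at all. Scoring against embeddings removes that restriction, which is what lets the approach described in the abstract cover both seen and unseen senses.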