Simple Embedding-Based Word Sense Disambiguation

Dieke Oele, Gertjan van Noord


Abstract
We present a simple knowledge-based WSD method that uses word and sense embeddings to compute the similarity between the gloss of a sense and the context of the word. Our method is inspired by the Lesk algorithm, as it exploits both the context of the words and the definitions of the senses. It requires only large unlabeled corpora and a sense inventory such as WordNet, and therefore does not rely on annotated data. We explore whether additional extensions to Lesk are compatible with our method. Our experiments show that lexically extending the number of words in the gloss and context, although it works well for other implementations of Lesk, harms our method. Applying a lexical selection method to the context words, on the other hand, improves it. Combined with lexical selection, our method outperforms state-of-the-art knowledge-based systems.
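The core idea of the abstract, scoring each candidate sense by the similarity between its gloss and the context of the target word, can be illustrated with a minimal sketch. This is not the authors' implementation: the toy vectors in `EMB`, the sense labels in `GLOSSES`, the use of averaged word vectors, and the function names are all assumptions made for illustration, and the sense-embedding component mentioned in the abstract is left out.

```python
import math

# Hypothetical toy word embeddings. A real system would load vectors
# trained on a large unlabeled corpus instead.
EMB = {
    "money": [1.0, 0.0, 0.0],
    "deposit": [0.9, 0.1, 0.0],
    "institution": [0.8, 0.2, 0.0],
    "financial": [1.0, 0.1, 0.0],
    "river": [0.0, 1.0, 0.0],
    "water": [0.0, 0.9, 0.1],
    "sloping": [0.1, 0.8, 0.1],
    "land": [0.0, 0.7, 0.3],
}

# Hypothetical sense inventory: illustrative sense label -> tokenized gloss.
GLOSSES = {
    "bank.financial": ["financial", "institution", "money"],
    "bank.river": ["sloping", "land", "river", "water"],
}


def average(words):
    """Average the vectors of the known words; None if no word is known."""
    vecs = [EMB[w] for w in words if w in EMB]
    if not vecs:
        return None
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def disambiguate(context, glosses):
    """Return the sense whose gloss vector is closest to the context vector."""
    ctx_vec = average(context)
    if ctx_vec is None:
        return None
    best_sense, best_score = None, float("-inf")
    for sense, gloss in glosses.items():
        gloss_vec = average(gloss)
        if gloss_vec is None:
            continue
        score = cosine(gloss_vec, ctx_vec)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


print(disambiguate(["deposit", "money"], GLOSSES))
print(disambiguate(["river", "water"], GLOSSES))
```

Unlike the original Lesk algorithm, which counts exact word overlap between gloss and context, a comparison in embedding space can match a gloss to a context even when the two share no words, as in the first call above.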
Anthology ID:
2018.gwc-1.30
Volume:
Proceedings of the 9th Global Wordnet Conference
Month:
January
Year:
2018
Address:
Nanyang Technological University (NTU), Singapore
Venue:
GWC
Publisher:
Global Wordnet Association
Pages:
259–265
URL:
https://aclanthology.org/2018.gwc-1.30
Cite (ACL):
Dieke Oele and Gertjan van Noord. 2018. Simple Embedding-Based Word Sense Disambiguation. In Proceedings of the 9th Global Wordnet Conference, pages 259–265, Nanyang Technological University (NTU), Singapore. Global Wordnet Association.
Cite (Informal):
Simple Embedding-Based Word Sense Disambiguation (Oele & van Noord, GWC 2018)
PDF:
https://preview.aclanthology.org/starsem-semeval-split/2018.gwc-1.30.pdf
Data
Word Sense Disambiguation: a Unified Evaluation Framework and Empirical Comparison