Detecting Most Frequent Sense using Word Embeddings and BabelNet

Harpreet Singh Arora, Sudha Bhingardive, Pushpak Bhattacharyya


Abstract
Since the inception of the SENSEVAL evaluation exercises, there has been a great deal of research into Word Sense Disambiguation (WSD). Over the years, various supervised, unsupervised and knowledge-based WSD systems have been proposed. Beating the first-sense heuristic remains a challenging task for these systems. In this paper, we present our work on Most Frequent Sense (MFS) detection using word embeddings and BabelNet features. Semantic features from BabelNet (synsets, glosses, relations, etc.) are used to generate sense embeddings. We compare the word embedding of a word with each of its sense embeddings and take the sense with the highest similarity as the MFS. The MFS is detected for six languages: English, Spanish, Russian, German, French and Italian. However, the approach can be applied to any language for which word embeddings are available.
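
The comparison step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that a sense embedding is the mean of the word vectors of the sense's feature words and that similarity is cosine similarity, neither of which the abstract specifies. The function names, the toy three-dimensional vectors and the sense identifiers are all hypothetical; in practice the feature words would come from BabelNet and the vectors from a trained embedding model.

```python
import math


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def sense_embedding(feature_words, embeddings):
    """Assumption: average the vectors of the feature words we have vectors for."""
    vectors = [embeddings[w] for w in feature_words if w in embeddings]
    return mean_vector(vectors) if vectors else None


def most_frequent_sense(word, sense_features, embeddings):
    """Return (sense_id, similarity) for the sense closest to the word vector."""
    word_vec = embeddings[word]
    best_sense, best_score = None, -2.0  # cosine is never below -1
    for sense_id, feature_words in sense_features.items():
        vec = sense_embedding(feature_words, embeddings)
        if vec is None:
            continue  # no usable feature words for this sense
        score = cosine(word_vec, vec)
        if score > best_score:
            best_sense, best_score = sense_id, score
    return best_sense, best_score


# Hypothetical toy data standing in for real embeddings and BabelNet features.
TOY_EMBEDDINGS = {
    "bank": [0.9, 0.1, 0.3],
    "money": [1.0, 0.0, 0.2],
    "deposit": [0.8, 0.2, 0.1],
    "river": [0.0, 1.0, 0.1],
    "slope": [0.1, 0.9, 0.0],
}
TOY_SENSES = {
    "bank#finance": ["money", "deposit", "institution"],  # "institution" has no vector
    "bank#river": ["river", "slope"],
}

best, score = most_frequent_sense("bank", TOY_SENSES, TOY_EMBEDDINGS)
print(best, round(score, 3))
```

With these toy vectors the financial sense is selected because the vector for "bank" lies closer to the averaged financial feature words than to the river ones. Feature words without a vector are skipped rather than treated as zeros, so they do not pull the average toward the origin.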
Anthology ID:
2016.gwc-1.4
Volume:
Proceedings of the 8th Global WordNet Conference (GWC)
Month:
27–30 January
Year:
2016
Address:
Bucharest, Romania
Editors:
Christiane Fellbaum, Piek Vossen, Verginica Barbu Mititelu, Corina Forascu
Venue:
GWC
SIG:
SIGLEX
Publisher:
Global Wordnet Association
Pages:
21–25
URL:
https://aclanthology.org/2016.gwc-1.4
Cite (ACL):
Harpreet Singh Arora, Sudha Bhingardive, and Pushpak Bhattacharyya. 2016. Detecting Most Frequent Sense using Word Embeddings and BabelNet. In Proceedings of the 8th Global WordNet Conference (GWC), pages 21–25, Bucharest, Romania. Global Wordnet Association.
Cite (Informal):
Detecting Most Frequent Sense using Word Embeddings and BabelNet (Arora et al., GWC 2016)
PDF:
https://preview.aclanthology.org/nschneid-patch-4/2016.gwc-1.4.pdf