Don’t Neglect the Obvious: On the Role of Unambiguous Words in Word Sense Disambiguation

Daniel Loureiro, Jose Camacho-Collados


Abstract
State-of-the-art methods for Word Sense Disambiguation (WSD) combine two different features: the power of pre-trained language models and a propagation method to extend the coverage of such models. This propagation is needed as current sense-annotated corpora lack coverage of many instances in the underlying sense inventory (usually WordNet). At the same time, unambiguous words make up a large portion of all words in WordNet, while being poorly covered in existing sense-annotated corpora. In this paper, we propose a simple method to provide annotations for most unambiguous words in a large corpus. We introduce the UWA (Unambiguous Word Annotations) dataset and show how a state-of-the-art propagation-based model can use it to extend the coverage and quality of its word sense embeddings by a significant margin, improving on its original results on WSD.
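The core idea in the abstract, annotating only those words that have a single sense in the inventory, can be illustrated with a minimal sketch. This is not the authors' implementation: the toy inventory, its made-up sense keys, and the function name `annotate_unambiguous` are all assumptions for illustration, and a real pipeline over WordNet would likely also need lemmatization and part-of-speech handling.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy sense inventory: lemma -> list of sense keys.
# The paper uses WordNet; these entries and keys are illustrative only.
TOY_INVENTORY: Dict[str, List[str]] = {
    "bank": ["bank%finance", "bank%river"],
    "spring": ["spring%season", "spring%coil", "spring%water"],
    "oxygen": ["oxygen%element"],
    "laptop": ["laptop%computer"],
}


def annotate_unambiguous(
    tokens: List[str], inventory: Dict[str, List[str]]
) -> List[Tuple[str, Optional[str]]]:
    """Tag each token with its sense only if the inventory lists exactly one."""
    annotated = []
    for token in tokens:
        senses = inventory.get(token.lower(), [])
        # A word is treated as unambiguous when it has a single listed sense.
        sense = senses[0] if len(senses) == 1 else None
        annotated.append((token, sense))
    return annotated


sentence = "The laptop sat near the bank".split()
result = annotate_unambiguous(sentence, TOY_INVENTORY)
```

In this sketch, "laptop" receives an annotation because it has one listed sense, while "bank" is left untagged because it has two; words missing from the inventory are also left untagged.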
Anthology ID:
2020.emnlp-main.283
Volume:
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
Month:
November
Year:
2020
Address:
Online
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
3514–3520
URL:
https://aclanthology.org/2020.emnlp-main.283
DOI:
10.18653/v1/2020.emnlp-main.283
Cite (ACL):
Daniel Loureiro and Jose Camacho-Collados. 2020. Don’t Neglect the Obvious: On the Role of Unambiguous Words in Word Sense Disambiguation. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 3514–3520, Online. Association for Computational Linguistics.
Cite (Informal):
Don’t Neglect the Obvious: On the Role of Unambiguous Words in Word Sense Disambiguation (Loureiro & Camacho-Collados, EMNLP 2020)
PDF:
https://preview.aclanthology.org/auto-file-uploads/2020.emnlp-main.283.pdf
Video:
https://slideslive.com/38939011