Coming to Your Senses: on Controls and Evaluation Sets in Polysemy Research

Haim Dubossarsky, Eitan Grossman, Daphna Weinshall


Abstract
The point of departure of this article is the claim that sense-specific vectors provide an advantage over normal vectors due to the polysemy that they presumably represent. This claim is based on performance gains observed in gold standard evaluation tests such as word similarity tasks. We demonstrate that this claim, at least as it is instantiated in prior art, is unfounded in two ways. Furthermore, we provide empirical data and an analytic discussion that may account for the previously reported improved performance. First, we show that ground-truth polysemy degrades performance in word similarity tasks. Therefore word similarity tasks are not suitable as an evaluation test for polysemy representation. Second, random assignment of words to senses is shown to improve performance in the same task. This and additional results point to the conclusion that performance gains as reported in previous work may be an artifact of random sense assignment, which is equivalent to sub-sampling and multiple estimation of word vector representations. Theoretical analysis shows that this may on its own be beneficial for the estimation of word similarity, by reducing the bias in the estimation of the cosine distance.
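The random-assignment control mentioned in the abstract can be pictured with a small toy sketch. It is not the authors' code: the function name `assign_random_senses`, the `word#k` tagging scheme, and the example sentence are all assumptions made for illustration. Each occurrence of a target word is relabelled with a pseudo-sense drawn uniformly at random, so any embedding trained on the relabelled text would learn several vectors per word, each from a random sub-sample of that word's contexts.

```python
import random


def assign_random_senses(tokens, targets, n_senses=2, seed=0):
    """Relabel every occurrence of a target word with a random pseudo-sense.

    Hypothetical helper: the `word#k` tag format is an assumption,
    not a convention taken from the paper.
    """
    rng = random.Random(seed)  # local generator, so results are reproducible
    relabeled = []
    for tok in tokens:
        if tok in targets:
            # Pick the pseudo-sense independently of the context.
            relabeled.append(f"{tok}#{rng.randrange(n_senses)}")
        else:
            relabeled.append(tok)
    return relabeled


corpus = "the bank raised rates while the river bank flooded".split()
tagged = assign_random_senses(corpus, targets={"bank"}, n_senses=2, seed=0)
```

Because the labels ignore context, the resulting "senses" carry no information about polysemy. That is what makes this kind of assignment usable as a control against which sense-specific vectors can be compared.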
Anthology ID:
D18-1200
Volume:
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
Month:
October-November
Year:
2018
Address:
Brussels, Belgium
Editors:
Ellen Riloff, David Chiang, Julia Hockenmaier, Jun’ichi Tsujii
Venue:
EMNLP
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
1732–1740
URL:
https://aclanthology.org/D18-1200
DOI:
10.18653/v1/D18-1200
Cite (ACL):
Haim Dubossarsky, Eitan Grossman, and Daphna Weinshall. 2018. Coming to Your Senses: on Controls and Evaluation Sets in Polysemy Research. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 1732–1740, Brussels, Belgium. Association for Computational Linguistics.
Cite (Informal):
Coming to Your Senses: on Controls and Evaluation Sets in Polysemy Research (Dubossarsky et al., EMNLP 2018)
PDF:
https://preview.aclanthology.org/nschneid-patch-2/D18-1200.pdf
Video:
https://preview.aclanthology.org/nschneid-patch-2/D18-1200.mp4