UZH at SemEval-2020 Task 3: Combining BERT with WordNet Sense Embeddings to Predict Graded Word Similarity Changes

Li Tang


Abstract
CoSimLex is a dataset that can be used to evaluate the ability of context-dependent word embed- dings for modeling subtle, graded changes of meaning, as perceived by humans during reading. At SemEval-2020, task 3, subtask 1 is about ”predicting the (graded) effect of context in word similarity”, using CoSimLex to quantify such a change of similarity for a pair of words, from one context to another. Here, a meaning shift is composed of two aspects, a) discrete changes observed between different word senses, and b) more subtle changes of meaning representation that are not captured in those discrete changes. Therefore, this SemEval task was designed to allow the evaluation of systems that can deal with a mix of both situations of semantic shift, as they occur in the human perception of meaning. The described system was developed to improve the BERT baseline provided with the task, by reducing distortions in the BERT semantic space, compared to the human semantic space. To this end, complementarity between 768- and 1024-dimensional BERT embeddings, and average word sense vectors were used. With this system, after some fine-tuning, the baseline performance of 0.705 (uncentered Pearson correlation with human semantic shift data from 27 annotators) was enhanced by more than 6%, to 0.7645. We hope that this work can make a contribution to further our understanding of the semantic vector space of human perception, as it can be modeled with context-dependent word embeddings in natural language processing systems.
Anthology ID:
2020.semeval-1.19
Volume:
Proceedings of the Fourteenth Workshop on Semantic Evaluation
Month:
December
Year:
2020
Address:
Barcelona (online)
Venues:
COLING | SemEval
SIGs:
SIGLEX | SIGSEM
Publisher:
International Committee for Computational Linguistics
Note:
Pages:
166–170
Language:
URL:
https://aclanthology.org/2020.semeval-1.19
DOI:
10.18653/v1/2020.semeval-1.19
Bibkey:
Cite (ACL):
Li Tang. 2020. UZH at SemEval-2020 Task 3: Combining BERT with WordNet Sense Embeddings to Predict Graded Word Similarity Changes. In Proceedings of the Fourteenth Workshop on Semantic Evaluation, pages 166–170, Barcelona (online). International Committee for Computational Linguistics.
Cite (Informal):
UZH at SemEval-2020 Task 3: Combining BERT with WordNet Sense Embeddings to Predict Graded Word Similarity Changes (Tang, SemEval 2020)
Copy Citation:
PDF:
https://preview.aclanthology.org/update-css-js/2020.semeval-1.19.pdf