Abstract
What do people know when they know the meaning of words? Word associations have been widely used to tap into lexical representations and their structure, as a way of probing semantic knowledge in humans. We investigate whether current word embedding spaces (contextualized and uncontextualized) can be considered good models of human lexical knowledge by studying whether they have characteristics comparable to those of human association spaces. We study three properties: association rank, asymmetry of similarity, and the triangle inequality. We find that word embeddings are good models of some properties of word associations. They replicate human associations between words well and, like humans, their context-aware variants show violations of the triangle inequality. While they do show asymmetry of similarities, their asymmetries do not map onto those of human association norms.
- Anthology ID:
- 2020.conll-1.30
- Volume:
- Proceedings of the 24th Conference on Computational Natural Language Learning
- Month:
- November
- Year:
- 2020
- Address:
- Online
- Venue:
- CoNLL
- SIG:
- SIGNLL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 376–385
- URL:
- https://aclanthology.org/2020.conll-1.30
- DOI:
- 10.18653/v1/2020.conll-1.30
- Cite (ACL):
- Maria A. Rodriguez and Paola Merlo. 2020. Word associations and the distance properties of context-aware word embeddings. In Proceedings of the 24th Conference on Computational Natural Language Learning, pages 376–385, Online. Association for Computational Linguistics.
- Cite (Informal):
- Word associations and the distance properties of context-aware word embeddings (A. Rodriguez & Merlo, CoNLL 2020)
- PDF:
- https://preview.aclanthology.org/ingestion-script-update/2020.conll-1.30.pdf
- Data
- BookCorpus