Abstract
Compositional Distributional Semantic Models (CDSMs) model the meaning of phrases and sentences in vector space. They have been predominantly evaluated on limited, artificial tasks such as semantic sentence similarity on hand-constructed datasets. This paper argues for lexical substitution (LexSub) as a means to evaluate CDSMs. LexSub is a more natural task, enables us to evaluate meaning composition at the level of individual words, and provides a common ground to compare CDSMs with dedicated LexSub models. We create a LexSub dataset for CDSM evaluation from a corpus with manual “all-words” LexSub annotation. Our experiments indicate that the Practical Lexical Function CDSM outperforms simple component-wise CDSMs and performs on par with the context2vec LexSub model using the same context.

- Anthology ID:
- N18-2033
- Volume:
- Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers)
- Month:
- June
- Year:
- 2018
- Address:
- New Orleans, Louisiana
- Venue:
- NAACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 206–211
- URL:
- https://aclanthology.org/N18-2033
- DOI:
- 10.18653/v1/N18-2033
- Cite (ACL):
- Maja Buljan, Sebastian Padó, and Jan Šnajder. 2018. Lexical Substitution for Evaluating Compositional Distributional Models. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers), pages 206–211, New Orleans, Louisiana. Association for Computational Linguistics.
- Cite (Informal):
- Lexical Substitution for Evaluating Compositional Distributional Models (Buljan et al., NAACL 2018)
- PDF:
- https://preview.aclanthology.org/ingestion-script-update/N18-2033.pdf