Conception: Multilingually-Enhanced, Human-Readable Concept Vector Representations

Simone Conia; Roberto Navigli

doi:10.18653/v1/2020.coling-main.291

Abstract

To date, the most successful word, word sense, and concept modelling techniques have used large corpora and knowledge resources to produce dense vector representations that capture semantic similarities in a relatively low-dimensional space. Most current approaches, however, suffer from a monolingual bias, with their strength depending on the amount of data available across languages. In this paper we address this issue and propose Conception, a novel technique for building language-independent vector representations of concepts which places multilinguality at its core while retaining explicit relationships between concepts. Our approach results in high-coverage representations that outperform the state of the art in multilingual and cross-lingual Semantic Word Similarity and Word Sense Disambiguation, proving particularly robust on low-resource languages. Conception – its software and the complete set of representations – is available at https://github.com/SapienzaNLP/conception.

Anthology ID: 2020.coling-main.291
Volume: Proceedings of the 28th International Conference on Computational Linguistics
Month: December
Year: 2020
Address: Barcelona, Spain (Online)
Editors: Donia Scott, Nuria Bel, Chengqing Zong
Venue: COLING
Publisher: International Committee on Computational Linguistics
Pages: 3268–3284
URL: https://aclanthology.org/2020.coling-main.291
DOI: 10.18653/v1/2020.coling-main.291
Cite (ACL): Simone Conia and Roberto Navigli. 2020. Conception: Multilingually-Enhanced, Human-Readable Concept Vector Representations. In Proceedings of the 28th International Conference on Computational Linguistics, pages 3268–3284, Barcelona, Spain (Online). International Committee on Computational Linguistics.
Cite (Informal): Conception: Multilingually-Enhanced, Human-Readable Concept Vector Representations (Conia & Navigli, COLING 2020)
PDF: https://preview.aclanthology.org/nschneid-patch-4/2020.coling-main.291.pdf
Code: sapienzanlp/conception
Data: ConceptNet, Senseval-2, Word Sense Disambiguation: a Unified Evaluation Framework and Empirical Comparison
