Automatic Word Association Norms (AWAN)

Jorge Reyes-Magaña, Gerardo Sierra Martínez, Gemma Bel-Enguix, Helena Gomez-Adorno


Abstract
Word Association Norms (WAN) are collections that pair stimulus words with the sets of responses associated with them. Such corpora are widely used in diverse areas of expertise. To reduce the effort of obtaining a good-quality resource that can be reproduced in many languages from minimal sources, a methodology to build Automatic Word Association Norms (AWAN) is proposed. The methodology takes two simple inputs: a) a dictionary, and b) pre-processed word embeddings. This new kind of WAN is evaluated in two ways: i) learning word embeddings with the node2vec algorithm and comparing them against human-annotated benchmarks, and ii) performing a lexical search for a reverse dictionary. Both evaluations are carried out on a weighted graph built from the AWAN lexical elements. The results show that the methodology produces good-quality AWANs.
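As a rough illustration of the idea in the abstract, the sketch below builds a small weighted stimulus-response graph from a dictionary and a set of word vectors, then runs a simple reverse-dictionary lookup on it. It is not the authors' pipeline: the abstract does not say how edges are weighted or how the search is scored, so the cosine-similarity weights, the summed-weight ranking, the stopword list, and all data and names (`DICTIONARY`, `EMBEDDINGS`, `build_graph`, `reverse_lookup`) are assumptions made for this example.

```python
import math

# Hypothetical toy dictionary: headword -> definition text.
DICTIONARY = {
    "dog": "domestic animal that barks",
    "cat": "domestic animal that meows",
    "bark": "sound made by a dog",
}

# Hypothetical 2-d vectors standing in for pre-processed word embeddings.
EMBEDDINGS = {
    "dog": (1.0, 0.1), "cat": (0.9, 0.3), "bark": (0.8, -0.2),
    "domestic": (0.7, 0.4), "animal": (0.9, 0.2), "barks": (0.8, -0.1),
    "meows": (0.6, 0.6), "sound": (0.2, -0.9),
}

# Assumed stopword list; a real setup would use a proper one.
STOPWORDS = {"that", "made", "by", "a"}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_graph(dictionary, embeddings):
    """Return {stimulus: {response: weight}}.

    Assumption: each definition word is a response to its headword,
    weighted by the cosine similarity of the two embeddings.
    """
    graph = {}
    for head, definition in dictionary.items():
        responses = {}
        for token in definition.split():
            if token in STOPWORDS or token == head:
                continue
            if head not in embeddings or token not in embeddings:
                continue
            responses[token] = cosine(embeddings[head], embeddings[token])
        graph[head] = responses
    return graph


def reverse_lookup(graph, query_words):
    """Rank headwords by the total weight of their edges to the query words."""
    scores = {
        head: sum(responses.get(word, 0.0) for word in query_words)
        for head, responses in graph.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


graph = build_graph(DICTIONARY, EMBEDDINGS)
ranking = reverse_lookup(graph, ["animal", "barks"])
```

Here `ranking` lists the headwords from best to worst match for the description "animal, barks". A graph in this shape could also be passed to a node2vec implementation to learn embeddings, which is the other evaluation the abstract mentions.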
Anthology ID:
2020.cogalex-1.17
Volume:
Proceedings of the Workshop on the Cognitive Aspects of the Lexicon
Month:
December
Year:
2020
Address:
Online
Venue:
CogALex
Publisher:
Association for Computational Linguistics
Pages:
142–153
URL:
https://aclanthology.org/2020.cogalex-1.17
Cite (ACL):
Jorge Reyes-Magaña, Gerardo Sierra Martínez, Gemma Bel-Enguix, and Helena Gomez-Adorno. 2020. Automatic Word Association Norms (AWAN). In Proceedings of the Workshop on the Cognitive Aspects of the Lexicon, pages 142–153, Online. Association for Computational Linguistics.
Cite (Informal):
Automatic Word Association Norms (AWAN) (Reyes-Magaña et al., CogALex 2020)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2020.cogalex-1.17.pdf