An experimental analysis of Noise-Contrastive Estimation: the noise distribution matters

Matthieu Labeau, Alexandre Allauzen


Abstract
Noise-Contrastive Estimation (NCE) is a learning procedure that is regularly used to train neural language models, since it avoids the computational bottleneck caused by the output softmax. In this paper, we attempt to explain some of the weaknesses of this objective function, and to draw directions for further developments. Experiments on a small task show the issues raised by a unigram noise distribution, and that a context-dependent noise distribution, such as the bigram distribution, can solve these issues and provide stable and data-efficient learning.
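To make the abstract's contrast concrete, the following is a minimal sketch of the standard NCE binary-classification loss with a pluggable noise distribution. It is not taken from the paper and may differ from the exact variant the authors use. The toy corpus, the function names (`log_unigram`, `log_bigram`, `nce_loss`), and the add-one smoothing of the bigram counts are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

# Toy corpus (an assumption for illustration, not the paper's data).
corpus = "the cat sat on the mat the cat ate".split()
vocab = sorted(set(corpus))

# Unigram noise: p_n(w), independent of the context.
unigram = Counter(corpus)
total = sum(unigram.values())

def log_unigram(word, prev=None):
    return math.log(unigram[word] / total)

# Bigram noise: p_n(w | prev), with add-one smoothing (an assumption)
# so that unseen pairs keep a non-zero probability.
bigram = defaultdict(Counter)
for prev, word in zip(corpus, corpus[1:]):
    bigram[prev][word] += 1

def log_bigram(word, prev):
    counts = bigram[prev]
    return math.log((counts[word] + 1) / (sum(counts.values()) + len(vocab)))

def log_sigmoid(x):
    # Numerically stable log(sigmoid(x)).
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))

def nce_loss(target_score, target_log_pn, noise_scores, noise_log_pn):
    """NCE loss for one observed word and k noise samples.

    Scores are unnormalized model log-scores s(w, context); the log_pn
    values are log-probabilities of the same words under the noise
    distribution. The observed word is classified as data, each noise
    sample as noise.
    """
    log_k = math.log(len(noise_scores))
    loss = -log_sigmoid(target_score - log_k - target_log_pn)
    for score, log_pn in zip(noise_scores, noise_log_pn):
        loss -= log_sigmoid(-(score - log_k - log_pn))
    return loss
```

Swapping `log_unigram` for `log_bigram` when computing the noise log-probabilities changes only the noise term inside each sigmoid, which is the choice the abstract identifies as decisive. The model scores themselves are left abstract here.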
Anthology ID:
E17-2003
Volume:
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers
Month:
April
Year:
2017
Address:
Valencia, Spain
Venue:
EACL
Publisher:
Association for Computational Linguistics
Pages:
15–20
URL:
https://aclanthology.org/E17-2003
Cite (ACL):
Matthieu Labeau and Alexandre Allauzen. 2017. An experimental analysis of Noise-Contrastive Estimation: the noise distribution matters. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers, pages 15–20, Valencia, Spain. Association for Computational Linguistics.
Cite (Informal):
An experimental analysis of Noise-Contrastive Estimation: the noise distribution matters (Labeau & Allauzen, EACL 2017)
PDF:
https://preview.aclanthology.org/paclic-22-ingestion/E17-2003.pdf