Maxat Tezekbayev

2022

pdf bib abs
Speeding Up Entmax
Maxat Tezekbayev | Vassilina Nikoulina | Matthias Gallé | Zhenisbek Assylbekov
Findings of the Association for Computational Linguistics: NAACL 2022

Softmax is the de facto standard for normalizing logits in modern neural networks for language processing. However, by producing a dense probability distribution each token in the vocabulary has a nonzero chance of being selected at each generation step, leading to a variety of reported problems in text generation. 𝛼-entmax of Peters et al. (2019) solves this problem, but is unfortunately slower than softmax. In this paper, we propose an alternative to 𝛼-entmax, which keeps its virtuous characteristics, but is as fast as optimized softmax and achieves on par or better performance in machine translation task.

Co-authors

Zhenisbek Assylbekov 1
Matthias Gallé 1
Vassilina Nikoulina 1

Venues

findings1

Fix data