Abstract
Long samples of text from neural language models can be of poor quality. Truncation sampling algorithms, like top-p or top-k, address this by setting some words' probabilities to zero at each step. This work investigates why these methods are important, and how to improve them. We propose thinking of a neural language model as a mixture of a true distribution and a smoothing distribution that avoids infinite perplexity. In this light, truncation algorithms aim to perform desmoothing, estimating a subset of the support of the true distribution. Finding a good subset is crucial: we show that top-p unnecessarily truncates high-probability words, for example causing it to truncate all words but Trump for a document that starts with Donald. We introduce eta-sampling, which truncates words below an entropy-dependent probability threshold. Compared to previous algorithms, our eta-sampling generates more plausible long documents according to humans, is better at breaking out of repetition, and behaves more reasonably on a battery of test distributions.
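The abstract only states that eta-sampling truncates words below an entropy-dependent probability threshold. Below is a minimal PyTorch sketch, assuming the threshold form eta = min(epsilon, sqrt(epsilon) * exp(-H(p))) described in the paper's body, where H(p) is the entropy of the next-word distribution and epsilon is the single hyperparameter; the function name and the default epsilon value are illustrative, not the authors' reference implementation.

```python
import torch

def eta_sampling_filter(logits: torch.Tensor, epsilon: float = 0.0006) -> torch.Tensor:
    """Sketch of eta-sampling: remove words whose probability falls below
    an entropy-dependent threshold eta = min(epsilon, sqrt(epsilon) * exp(-H(p)))."""
    probs = torch.softmax(logits, dim=-1)
    # Shannon entropy of the next-word distribution (clamp avoids log(0)).
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1, keepdim=True)
    # Entropy-dependent threshold: when the model is uncertain (high entropy),
    # eta shrinks and fewer words are truncated; when the distribution is
    # peaked (low entropy), eta is capped at epsilon.
    eta = torch.clamp((epsilon ** 0.5) * torch.exp(-entropy), max=epsilon)
    # Truncation: filtered words get -inf logits, i.e., zero probability.
    return logits.masked_fill(probs < eta, float("-inf"))

# Usage: sample the next word from the truncated distribution.
logits = torch.randn(50257)  # e.g., a GPT-2-sized vocabulary
filtered = eta_sampling_filter(logits)
next_word = torch.multinomial(torch.softmax(filtered, dim=-1), num_samples=1)
```

Because the threshold adapts to entropy, this rule can keep a single dominant word when the model is confident (where top-p may over-truncate) while still pruning the unreliable low-probability tail when the model is uncertain.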
- Anthology ID:
- 2022.findings-emnlp.249
- Volume:
- Findings of the Association for Computational Linguistics: EMNLP 2022
- Month:
- December
- Year:
- 2022
- Address:
- Abu Dhabi, United Arab Emirates
- Editors:
- Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 3414–3427
- URL:
- https://aclanthology.org/2022.findings-emnlp.249
- DOI:
- 10.18653/v1/2022.findings-emnlp.249
- Cite (ACL):
- John Hewitt, Christopher Manning, and Percy Liang. 2022. Truncation Sampling as Language Model Desmoothing. In Findings of the Association for Computational Linguistics: EMNLP 2022, pages 3414–3427, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
- Cite (Informal):
- Truncation Sampling as Language Model Desmoothing (Hewitt et al., Findings 2022)
- PDF:
- https://aclanthology.org/2022.findings-emnlp.249.pdf