Pun Generation with Surprise

He He, Nanyun Peng, Percy Liang


Abstract
We tackle the problem of generating a pun sentence given a pair of homophones (e.g., “died” and “dyed”). Puns are by their very nature statistically anomalous and not amenable to most text generation methods that are supervised by a large corpus. In this paper, we propose an unsupervised approach to pun generation based on lots of raw (unhumorous) text and a surprisal principle. Specifically, we posit that in a pun sentence, there is a strong association between the pun word (e.g., “dyed”) and the distant context, but a strong association between the alternative word (e.g., “died”) and the immediate context. We instantiate the surprisal principle in two ways: (i) as a measure based on the ratio of probabilities given by a language model, and (ii) a retrieve-and-edit approach based on words suggested by a skip-gram model. Based on human evaluation, our retrieve-and-edit approach generates puns successfully 30% of the time, doubling the success rate of a neural generation baseline.
Anthology ID:
N19-1172
Volume:
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)
Month:
June
Year:
2019
Address:
Minneapolis, Minnesota
Venue:
NAACL
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
1734–1744
Language:
URL:
https://aclanthology.org/N19-1172
DOI:
10.18653/v1/N19-1172
Bibkey:
Cite (ACL):
He He, Nanyun Peng, and Percy Liang. 2019. Pun Generation with Surprise. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 1734–1744, Minneapolis, Minnesota. Association for Computational Linguistics.
Cite (Informal):
Pun Generation with Surprise (He et al., NAACL 2019)
Copy Citation:
PDF:
https://preview.aclanthology.org/ingestion-script-update/N19-1172.pdf
Video:
 https://vimeo.com/359670150
Code
 hhexiy/pungen +  additional community code
Data
BookCorpus