Joe Toplyn
2026
Navigating the Joke Space: Towards Automated Originality Assessment of AI-Generated Humor
Ori Amir | Huyen Ngo | Joe Toplyn | Kevin Hickerson
Proceedings of the 2nd Workshop on Computational Humor (CHum 2026)
This study validates automated, corpus-based methods for quantifying joke originality using “topic handles” — key nouns or noun phrases capturing a joke’s script opposition and logical mechanism (per the General Theory of Verbal Humor). Using a reference corpus of one million jokes in English from Reddit, we compute Pointwise Mutual Information (PMI) in three variants (raw co-occurrence, semantic-cluster smoothing, and word-decomposition) and two embedding-based measures (handle-level conceptual distance and full-text corpus novelty via Sentence-BERT). We evaluate these measures on 400 LLM-generated jokes (200 each from GPT-4o and GPT-5.4) and 80 jokes from the Witscript-powered JEST benchmark, rated by three professional comedians for originality and funniness. Corpus novelty and concept distance between the most semantically distant handle pair both correlated significantly with human originality ratings (𝜌 = .37); PMI-based measures showed weaker but significant associations (𝜌 = .23–.25) on the most original handle pair. A Lasso-based composite of the three strongest predictors achieved 𝜌 = .40 (cross-validated), capturing 82% of the theoretically predictable variance given inter-rater agreement. These results demonstrate that handle-based PMI and semantic novelty metrics offer practical, quantitative tools for assessing originality in AI-generated humor, advancing objective evaluation of computational creativity.
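As a rough illustration of the raw co-occurrence PMI variant mentioned in the abstract, the sketch below scores a pair of handles by PMI(a, b) = log2( p(a, b) / (p(a) p(b)) ), with probabilities estimated as document frequencies over a reference corpus. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`build_counts`, `pmi`, `least_associated_pair`), the toy corpus, and the choice to return negative infinity for pairs that never co-occur are all hypothetical. The semantic-cluster smoothing and word-decomposition variants are not shown.

```python
import math
from collections import Counter
from itertools import combinations


def build_counts(docs):
    """Count handle and handle-pair document frequencies.

    `docs` is an iterable of handle collections, one per joke
    (an assumed input format, not the paper's).
    """
    single, pair, n = Counter(), Counter(), 0
    for handles in docs:
        n += 1
        unique = sorted(set(handles))           # count each handle once per joke
        single.update(unique)
        pair.update(combinations(unique, 2))    # keys are alphabetically ordered tuples
    return single, pair, n


def pmi(a, b, single, pair, n):
    """Raw PMI in bits; -inf if the pair never co-occurs (assumed convention)."""
    key = tuple(sorted((a, b)))
    c_ab = pair[key]
    if c_ab == 0:
        return float("-inf")
    # log2( (c_ab / n) / ((c_a / n) * (c_b / n)) ) simplifies to the line below
    return math.log2(c_ab * n / (single[a] * single[b]))


def least_associated_pair(handles, single, pair, n):
    """Return the handle pair with the lowest PMI, i.e. the least expected pairing."""
    candidates = combinations(sorted(set(handles)), 2)
    return min(candidates, key=lambda p: pmi(p[0], p[1], single, pair, n))


# Toy reference corpus: each set holds the handles of one joke (made-up data).
corpus = [{"cat", "dog"}, {"cat", "dog"}, {"cat", "lawyer"}, {"dog", "vet"}]
single, pair, n = build_counts(corpus)
print(pmi("cat", "dog", single, pair, n))
print(least_associated_pair(["cat", "dog", "lawyer"], single, pair, n))
```

In this sketch a lower PMI means the two handles appear together less often than chance would predict, which is one plausible reading of "most original handle pair". A real corpus would also need a policy for rare or unseen handles, since raw PMI is unstable at low counts.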
2025
Can AI Make Us Laugh? Comparing Jokes Generated by Witscript and a Human Expert
Joe Toplyn | Ori Amir
Proceedings of the 1st Workshop on Computational Humor (CHum)
This study compares the funniness of AI-generated jokes and those written by a professional human joke writer, using audience laughter as a direct measure. Prior research has typically relied on numerical ratings, which have limitations. Our findings show that AI-generated jokes elicited as much laughter as human-crafted ones, indicating that advanced AI joke generators can now produce original jokes on par with those of a professional human comedy writer.