2019
Embedding Lexical Features via Tensor Decomposition for Small Sample Humor Recognition
Zhenjie Zhao | Andrew Cattle | Evangelos Papalexakis | Xiaojuan Ma
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
We propose a novel tensor embedding method that can effectively extract lexical features for humor recognition. Specifically, we use word-word co-occurrence to encode the contextual content of documents, and then decompose the tensor to get corresponding vector representations. We show that this simple method can capture features of lexical humor effectively for continuous humor recognition. In particular, we achieve a distance of 0.887 on a global humor ranking task, comparable to the top performing systems from SemEval 2017 Task 6B (Potash et al., 2017) but without the need for any external training corpus. In addition, we show that this approach is also beneficial for small sample humor recognition tasks through a semi-supervised label propagation procedure, which achieves about 0.7 accuracy on the 16000 One-Liners (Mihalcea and Strapparava, 2005) and Pun of the Day (Yang et al., 2015) humor classification datasets using only 10% of known labels.
2018
Recognizing Humour using Word Associations and Humour Anchor Extraction
Andrew Cattle | Xiaojuan Ma
Proceedings of the 27th International Conference on Computational Linguistics
This paper attempts to marry the interpretability of statistical machine learning approaches with the more robust models of joke structure and joke semantics capable of being learned by neural models. Specifically, we explore the use of semantic relatedness features based on word associations, rather than the more common Word2Vec similarity, on a binary humour identification task and identify several factors that make word associations a better fit for humour. We also explore the effects of using joke structure, in the form of humour anchors (Yang et al., 2015), for improving the performance of semantic features and show that, while an intriguing idea, humour anchors contain several pitfalls that can hurt performance.
2017
SRHR at SemEval-2017 Task 6: Word Associations for Humour Recognition
Andrew Cattle | Xiaojuan Ma
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)
This paper explores the role of semantic relatedness features, such as word associations, in humour recognition. Specifically, we examine the task of inferring pairwise humour judgments in Twitter hashtag wars. We examine a variety of word association features derived from the University of South Florida Free Association Norms (USF) and the Edinburgh Associative Thesaurus (EAT) and find that word association-based features outperform Word2Vec similarity, a popular semantic relatedness measure. Our system achieves an accuracy of 56.42% using a combination of unigram perplexity, bigram perplexity, EAT difference (tweet-avg), USF forward (max), EAT difference (word-avg), USF difference (word-avg), EAT forward (min), USF difference (tweet-max), and EAT backward (min).
Predicting Word Association Strengths
Andrew Cattle | Xiaojuan Ma
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing
This paper looks at the task of predicting word association strengths across three datasets: WordNet Evocation (Boyd-Graber et al., 2006), the University of South Florida Free Association Norms (Nelson et al., 2004), and the Edinburgh Associative Thesaurus (Kiss et al., 1973). We achieve results of r=0.357 and p=0.379, r=0.344 and p=0.300, and r=0.292 and p=0.363, respectively. We find Word2Vec (Mikolov et al., 2013) and GloVe (Pennington et al., 2014) cosine similarities, as well as vector offsets, to be the highest performing features. Furthermore, we examine the usefulness of Gaussian embeddings (Vilnis and McCallum, 2014) for predicting word association strength, the first work to do so.
2016
Effects of Semantic Relatedness between Setups and Punchlines in Twitter Hashtag Games
Andrew Cattle | Xiaojuan Ma
Proceedings of the Workshop on Computational Modeling of People’s Opinions, Personality, and Emotions in Social Media (PEOPLES)
This paper explores humour recognition for Twitter-based hashtag games. Given their popularity, frequency, and relatively formulaic nature, these games make a good target for computational humour research and can leverage Twitter likes and retweets as humour judgments. In this work, we use pair-wise relative humour judgments to examine several measures of semantic relatedness between setups and punchlines on a hashtag game corpus we collected and annotated. Results show that perplexity, Normalized Google Distance, and free-word association-based features are all useful in identifying “funnier” hashtag game responses. In fact, we provide empirical evidence that funnier punchlines tend to be more obscure, although more obscure punchlines are not necessarily rated funnier. Furthermore, the asymmetric nature of free-word association features allows us to see that while punchlines should be harder to predict given a setup, they should also be relatively easy to understand in context.
2013
Applying Graph-based Keyword Extraction to Document Retrieval
Youngsam Kim | Munhyong Kim | Andrew Cattle | Julia Otmakhova | Suzi Park | Hyopil Shin
Proceedings of the Sixth International Joint Conference on Natural Language Processing
2012
Annotation Scheme for Constructing Sentiment Corpus in Korean
Hyopil Shin | Munhyong Kim | Hayeon Jang | Andrew Cattle
Proceedings of the 26th Pacific Asia Conference on Language, Information, and Computation