Milton King
In this work, we discuss our models that were applied to SemEval-2025 Task 11: Bridging the Gap in Text-Based Emotion Detection (Muhammad et al., 2025b). We focused on the English dataset of Track A, which involves determining which emotions the reader of a snippet of text is feeling. We applied three different types of models that vary in their approaches and report our findings on the task's test set. Our models' performance differed from one another, but none of them outperformed the task's baseline model.
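As a hedged illustration of the Track A setup only (not the systems described above), the sketch below trains a toy multi-label emotion classifier with scikit-learn; the label set and training texts are invented.

```python
# Illustrative sketch (not the authors' systems): a simple multi-label
# emotion classifier of the kind used for Track A, where each snippet
# of text can evoke several emotions at once.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline

EMOTIONS = ["anger", "fear", "joy", "sadness", "surprise"]  # assumed label set

# Toy training data; a real system would train on the task's labelled corpus.
texts = ["I can't believe we won!", "This is terrifying.", "What a dull day."]
labels = [[0, 0, 1, 0, 1], [0, 1, 0, 0, 0], [0, 0, 0, 1, 0]]  # multi-hot rows

clf = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    OneVsRestClassifier(LogisticRegression(max_iter=1000)),
)
clf.fit(texts, labels)
print(clf.predict(["That ending surprised everyone."]))
```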
Phobias, characterized by irrational fears of specific objects or situations, can profoundly affect an individual's quality of life. This research presents a comprehensive investigation into phobia classification, in which we propose a novel dataset of 811,569 English tweets from user timelines spanning 102 phobia subtypes over six months, including 47,614 self-diagnosed phobia users. BERT models were leveraged to differentiate non-phobia from phobia users and to classify users into 65 specific phobia subtypes. The study produced promising results, with a highest F1 score of 78.44% in binary classification (phobic vs. non-phobic user) and 24.01% in multi-class classification (detecting a user's specific phobia subtype). This research provides insights into people with phobias on social media and emphasizes the capacity of natural language processing and machine learning to automate the evaluation and support of mental health.
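A minimal sketch of the BERT fine-tuning step described above, using Hugging Face Transformers; the example texts, labels, and single gradient step are illustrative stand-ins for training on the tweet dataset.

```python
# Hedged sketch: fine-tuning a BERT classifier for binary phobia detection.
# Texts and labels below are toys, not the paper's dataset.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2  # phobic vs. non-phobic user
)

# Toy batch; a real run would iterate over user timelines.
batch = tokenizer(
    ["I can't go near spiders, it's unbearable.", "Lovely weather today!"],
    padding=True, truncation=True, return_tensors="pt",
)
labels = torch.tensor([1, 0])

outputs = model(**batch, labels=labels)
outputs.loss.backward()  # one gradient step of standard fine-tuning
print(outputs.logits.softmax(dim=-1))
```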
The predominant sense of a lemma can vary based on the timeframe (years, decades, centuries) in which the text was written. In our work, we explore the predominant sense of shorter timeframes (days, months, seasons, etc.) and find that different short timeframes can have predominant senses that differ from each other and from the predominant sense of a corpus. Leveraging the predominant sense and sense distribution of a short timeframe, we design short-timeframe temporal-aware word sense disambiguation (WSD) models that outperform a temporal-agnostic model. Likewise, author-aware WSD models tend to outperform author-agnostic models; we therefore augment our temporal-aware models to leverage knowledge of author-level predominant senses and sense distributions, creating temporal- and author-aware WSD models. In addition, we find that considering recent usages of a lemma by the same author can assist a WSD model. Our approach requires only a small amount of text from authors and timeframes.
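One plausible way to make a WSD model temporal-aware, sketched under the assumption that the approach combines a base model's sense scores with a timeframe's sense distribution; the interpolation below is illustrative, not the paper's exact formulation.

```python
# Assumed sketch: interpolate a temporal-agnostic WSD model's scores with
# the sense distribution observed in the short timeframe (e.g., the month)
# in which the text was written.
from collections import Counter

def temporal_aware_scores(base_scores, timeframe_counts, lam=0.5):
    """base_scores: {sense: score} from a temporal-agnostic WSD model.
    timeframe_counts: Counter of sense frequencies in the timeframe."""
    total = sum(timeframe_counts.values()) or 1
    return {
        sense: (1 - lam) * score + lam * timeframe_counts[sense] / total
        for sense, score in base_scores.items()
    }

base = {"bank.n.01": 0.55, "bank.n.02": 0.45}      # base model scores
june = Counter({"bank.n.02": 9, "bank.n.01": 1})   # senses observed in June
scores = temporal_aware_scores(base, june)
print(max(scores, key=scores.get))                 # the timeframe prior flips the decision
```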
In this paper, we explore three unsupervised learning models that we applied to Task 9: BRAINTEASER of SemEval 2024. Two of these models incorporate word sense disambiguation and part-of-speech tagging, specifically leveraging SensEmBERT and the Stanford log-linear part-of-speech tagger. Our third model relies on a more traditional language modelling approach. The best-performing model, a bag-of-words model leveraging word sense disambiguation and part-of-speech tagging, secured 10th place out of 11 on both the sentence-puzzle and word-puzzle subtasks.
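A toy sketch of the bag-of-words component only; the SensEmBERT sense embeddings and the Stanford POS tagger used by the full systems are omitted here, and the stopword list and puzzle are invented.

```python
# Toy bag-of-words scorer for a multiple-choice puzzle: pick the candidate
# answer that shares the most content words with the question. The full
# systems additionally used sense embeddings and POS tags.
STOPWORDS = {"the", "a", "an", "of", "to", "is", "what", "in", "you", "can"}

def bow(text):
    return {w for w in text.lower().split() if w not in STOPWORDS}

def best_answer(question, candidates):
    q = bow(question)
    return max(candidates, key=lambda c: len(q & bow(c)))

question = "What can you catch but not throw?"
candidates = ["a cold you catch in winter", "a heavy ball", "a long rope"]
print(best_answer(question, candidates))
```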
In this paper, we discuss our models applied to Task 4: Human Value Detection of SemEval 2023, which incorporated two different embedding techniques to interpret the data. Preliminary experiments were conducted to observe important word types. Subsequently, we explored an XGBoost model, an unsupervised learning model, and two ensemble learning models. The best-performing model, an ensemble model employing a soft-voting technique, secured 34th place out of 39 teams on a class-imbalanced dataset. We explored the inclusion of different parts of the provided knowledge resource and found that considering only specific parts assisted our models.
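A minimal sketch of the soft-voting idea with scikit-learn; the member classifiers and the toy human-value labels below are illustrative stand-ins, not the paper's exact models.

```python
# Soft voting: average the predicted class probabilities of several
# classifiers and pick the highest-probability class.
from sklearn.ensemble import VotingClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

texts = ["security matters most", "be kind to others",
         "tradition guides us", "freedom above all"]
labels = [0, 1, 2, 3]  # toy human-value classes

ensemble = make_pipeline(
    CountVectorizer(),
    VotingClassifier(
        estimators=[
            ("lr", LogisticRegression(max_iter=1000)),
            ("nb", MultinomialNB()),
        ],
        voting="soft",  # average class probabilities instead of hard votes
    ),
)
ensemble.fit(texts, labels)
print(ensemble.predict(["others deserve kindness"]))
```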
SemEval-2023’s Task 1, Visual Word Sense Disambiguation, is a task involving both text semantics and visual semantics: selecting, from a list of candidates, the image that best exhibits a given target word in a small context. We tried several methods, including an image-captioning method and CLIP-based methods, and submitted our predictions in the competition for this task. This paper describes the methods we used, reports their performance, and provides an analysis and discussion of the results.
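A hedged sketch of a CLIP-based method: embed the target word in context and each candidate image in CLIP's joint space, then pick the highest-scoring image. The checkpoint choice and the dummy solid-colour images are assumptions.

```python
# Sketch of CLIP-based visual WSD: rank candidate images by their
# similarity to the target word in its small context.
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

context = "andromeda tree"  # target word in its small context
# Dummy images so the sketch runs; real candidates come from the task data.
images = [Image.new("RGB", (224, 224), c) for c in ("green", "blue")]

inputs = processor(text=[context], images=images,
                   return_tensors="pt", padding=True)
outputs = model(**inputs)
# logits_per_image holds one image-text similarity score per candidate.
best = outputs.logits_per_image.argmax().item()
print(f"best candidate image: {best}")
```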
Authors of text tend to predominantly use a single sense for a lemma, and that sense can differ among authors. This might not be captured by an author-agnostic word sense disambiguation (WSD) model trained on multiple authors. Our work finds that WordNet’s first senses, the predominant senses of our dataset’s genre, and the predominant senses of an author can all differ; therefore, author-agnostic models could perform well over the entire dataset but poorly on individual authors. In this work, we explore methods for personalizing WSD models by tailoring existing state-of-the-art models toward an individual by exploiting the author’s sense distributions. We propose a novel WSD dataset and show that personalizing a WSD system with knowledge of an author’s sense distributions or predominant senses can greatly increase its performance.
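One possible personalization scheme, sketched as an assumption rather than the paper's exact method: back off to the author's predominant sense when the author-agnostic model is unconfident.

```python
# Assumed illustration of personalized WSD: trust the base model only when
# it is confident; otherwise defer to the author's predominant sense.
def personalized_sense(base_scores, author_predominant, threshold=0.6):
    """base_scores: {sense: probability} from an author-agnostic model.
    author_predominant: the sense this author uses most for the lemma."""
    best_sense = max(base_scores, key=base_scores.get)
    if base_scores[best_sense] >= threshold:
        return best_sense        # confident base prediction stands
    return author_predominant    # otherwise use the author prior

scores = {"star.n.01": 0.45, "star.n.02": 0.40, "star.n.03": 0.15}
print(personalized_sense(scores, author_predominant="star.n.02"))
```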
In this paper, we present three supervised systems for English lexical complexity prediction of single and multiword expressions for SemEval-2021 Task 1. We explore the use of statistical baseline features, masked language models, and character-level encoders to predict the complexity of a target token in context. Our best system combines information from these three sources. The results indicate that information from masked language models and character-level encoders can be combined to improve lexical complexity prediction.
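A minimal sketch of combining the three information sources: concatenate the feature vectors and fit a regressor. The feature extractors below are toy stand-ins for the real baseline features, masked language model, and character-level encoder.

```python
# Sketch of feature combination for lexical complexity prediction.
# Each extractor is a placeholder producing one or two numbers per token.
import numpy as np
from sklearn.linear_model import Ridge

def baseline_feats(token):   # e.g., token length and vowel ratio
    vowels = sum(ch in "aeiou" for ch in token)
    return [len(token), vowels / max(len(token), 1)]

def mlm_feat(token):         # stand-in for a masked-LM probability
    return [1.0 / (1 + len(token))]

def char_feat(token):        # stand-in for a character-encoder score
    return [len(set(token)) / max(len(token), 1)]

tokens = ["cat", "photosynthesis", "river", "sesquipedalian"]
complexities = [0.05, 0.70, 0.10, 0.85]  # toy gold complexity scores

X = np.array([baseline_feats(t) + mlm_feat(t) + char_feat(t) for t in tokens])
model = Ridge().fit(X, complexities)
print(model.predict(X).round(2))
```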
In this work, we consider the problem of personalizing language models, that is, building language models that are tailored to the writing style of an individual. Because training language models requires a large amount of text, and individuals do not necessarily possess a large corpus of their writing that could be used for training, approaches to personalizing language models must be able to rely on only a small amount of text from any one user. In this work, we compare three approaches to personalizing a language model that was trained on a large background corpus using a relatively small amount of text from an individual user. We evaluate these approaches using perplexity, as well as two measures based on next word prediction for smartphone soft keyboards. Our results show that when only a small amount of user-specific text is available, an approach based on priming gives the most improvement, while when larger amounts of user-specific text are available, an approach based on language model interpolation performs best. We carry out further experiments to show that these approaches to personalization outperform language model adaptation based on demographic factors.
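A sketch of the interpolation approach, P(w|h) = λ·P_user(w|h) + (1−λ)·P_background(w|h), using toy unigram distributions in place of real language models; the vocabulary and λ value are invented.

```python
# Linear language-model interpolation: mix a user-specific distribution
# with a background distribution estimated from a large corpus.
def interpolate(p_user, p_background, lam=0.3):
    vocab = set(p_user) | set(p_background)
    return {w: lam * p_user.get(w, 0.0) + (1 - lam) * p_background.get(w, 0.0)
            for w in vocab}

p_background = {"the": 0.5, "cat": 0.3, "pugs": 0.2}  # toy background LM
p_user = {"pugs": 0.6, "the": 0.4}  # this user often writes about pugs

p = interpolate(p_user, p_background)
print(max(p, key=p.get), round(p["pugs"], 2))  # user text boosts "pugs"
```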
In this paper we apply a range of approaches to language modeling – including word-level n-gram and neural language models, and character-level neural language models – to the problem of detecting hate speech and offensive language. Our findings indicate that language models are able to capture knowledge of whether text is hateful or offensive. However, our findings also indicate that more conventional approaches to text classification often perform similarly or better.
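An assumed illustration of the language-model classifier: train one model per class and label a text by which model assigns it higher probability. The add-one unigram models below are toys standing in for the paper's n-gram and neural LMs.

```python
# Classify by class-conditional language models: a text is labelled with
# the class whose LM finds it least surprising.
import math
from collections import Counter

def unigram_lm(corpus):
    counts = Counter(w for text in corpus for w in text.lower().split())
    total = sum(counts.values())
    vocab = len(counts) + 1
    return lambda w: (counts[w] + 1) / (total + vocab)  # add-one smoothing

def log_prob(text, lm):
    return sum(math.log(lm(w)) for w in text.lower().split())

offensive_lm = unigram_lm(["you are awful", "awful awful people"])
neutral_lm = unigram_lm(["have a nice day", "lovely weather today"])

text = "what an awful person"
label = ("offensive" if log_prob(text, offensive_lm) > log_prob(text, neutral_lm)
         else "neutral")
print(label)
```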
Verb-noun combinations (VNCs) – e.g., blow the whistle, hit the roof, and see stars – are a common type of English idiom whose idiomatic usages are ambiguous with literal usages. In this paper we propose and evaluate models for classifying VNC usages as idiomatic or literal, based on a variety of approaches to forming distributed representations. Our results show that a model based on averaging word embeddings performs on par with, or better than, a previously-proposed approach based on skip-thoughts. Idiomatic usages of VNCs are known to exhibit lexico-syntactic fixedness. We further incorporate this information into our models, demonstrating that this rich linguistic knowledge is complementary to the information carried by distributed representations.
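A sketch of the averaged-embedding representation: each VNC usage is represented by the mean of its sentence's word vectors and fed to a classifier. Random vectors stand in here for real pre-trained embeddings, and the training sentences are toys.

```python
# Average word embeddings to represent a VNC usage in context, then
# classify the usage as literal (0) or idiomatic (1).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
embeddings = {}  # stand-in for word2vec/GloVe lookups
def vec(word):
    return embeddings.setdefault(word, rng.normal(size=50))

def sentence_vec(sentence):
    return np.mean([vec(w) for w in sentence.lower().split()], axis=0)

sentences = ["the referee blew the whistle", "she blew the whistle on fraud",
             "he hit the roof of the car", "dad hit the roof when he saw it"]
labels = [0, 1, 0, 1]  # 0 = literal, 1 = idiomatic

X = np.stack([sentence_vec(s) for s in sentences])
clf = LogisticRegression(max_iter=1000).fit(X, labels)
print(clf.predict([sentence_vec("she blew the whistle on fraud")]))
```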
In this paper we present three unsupervised models for capturing discriminative attributes based on information from word embeddings, WordNet, and sentence-level word co-occurrence frequency. We show that, of these approaches, the simple approach based on word co-occurrence performs best. We further consider supervised and unsupervised approaches to combining information from these models, but these approaches do not improve on the word co-occurrence model.
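A toy sketch of the sentence-level co-occurrence model: an attribute is taken as discriminative for word1 (vs. word2) if it co-occurs with word1 in more sentences. The three-sentence corpus is a placeholder.

```python
# Count sentence-level co-occurrences to decide whether an attribute
# discriminates one word from another.
corpus = [
    "the banana is yellow and sweet",
    "a ripe banana turns yellow",
    "the cucumber is green and crisp",
]

def cooccurrence(word, attribute, sentences):
    return sum(word in s.split() and attribute in s.split() for s in sentences)

def is_discriminative(word1, word2, attribute, sentences):
    return (cooccurrence(word1, attribute, sentences)
            > cooccurrence(word2, attribute, sentences))

print(is_discriminative("banana", "cucumber", "yellow", corpus))  # True
```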
Usage similarity (USim) is an approach to determining word meaning in context that does not rely on a sense inventory. Instead, pairs of usages of a target lemma are rated on a scale. In this paper we propose unsupervised approaches to USim based on embeddings for words, contexts, and sentences, and achieve state-of-the-art results over two USim datasets. We further consider supervised approaches to USim, and find that although they outperform unsupervised approaches, they are unable to generalize to lemmas that are unseen in the training data.
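A hedged sketch of an embedding-based USim rater: score a pair of usages by the cosine similarity of the target lemma's contextual vectors. The BERT checkpoint and the single-wordpiece token lookup are simplifying assumptions.

```python
# Rate usage similarity as the cosine of two contextual target-word vectors.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def target_vec(sentence, target):
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]
    tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0].tolist())
    return hidden[tokens.index(target)]  # assumes target is one wordpiece

u1 = target_vec("she sat by the bank of the river", "bank")
u2 = target_vec("he deposited cash at the bank", "bank")
print(torch.cosine_similarity(u1, u2, dim=0).item())
```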