Mark A. Finlayson

2026

Not all events in a narrative are created equal: some events are more important than others. Kernel events, a concept introduced in the field of narratology, are causally linked events that move the narrative forward, and cannot be removed without breaking the narrative’s logical coherence. While event detection and extraction tasks have been widely studied in natural language processing and information retrieval fields, the idea of kernel events has been largely unexplored. In this work, we introduce the first corpus and model for kernel event detection. Our contributions include: the refinement of the kernel event concept captured in detailed annotation guidelines grounded in narratological principles; an annotation study yielding a gold-standard dataset of kernel events in narrative texts; and a first-of-its-kind kernel event detection system. Annotation achieved an inter-annotator agreement of 0.61 Kappa, underscoring the reliability of the guidelines. Using these data, we trained several models in both fine-tuned and generative modes for kernel event detection, with a LoRA fine-tuned Llama3 achieving an F1 of 0.695. This work establishes a benchmark for kernel event detection, with potential applications in summarization, narrative similarity detection, and narrative understanding. We release our code and data for the benefit of other researchers.
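As a rough illustration of the agreement figure reported above, the following sketch computes a kappa statistic for two annotators. It assumes Cohen's kappa over binary kernel/non-kernel labels; the abstract does not state which kappa variant or how many annotators were used, and the label lists here are hypothetical toy data, not drawn from the corpus.

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given identical labels.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's marginal label distribution.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both annotators used one and the same label throughout
    return (p_o - p_e) / (1 - p_e)

# Hypothetical toy annotations: 1 = kernel event, 0 = not a kernel event.
annotator_1 = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
annotator_2 = [1, 0, 0, 0, 1, 0, 1, 1, 0, 0]
kappa = cohen_kappa(annotator_1, annotator_2)
print(f"kappa = {kappa:.3f}")
```

In this toy case the annotators agree on 8 of 10 items, but kappa is lower than 0.8 because part of that agreement is expected by chance given how often each annotator uses each label.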
Argumentation mining comprises several subtasks, among which stance classification focuses on identifying the standpoint expressed in an argumentative text toward a specific target topic. While arguments, especially about controversial topics, often appeal to emotions, most prior work has not systematically incorporated explicit, fine-grained emotion analysis to improve performance on this task. In particular, prior research on stance classification has predominantly utilized non-argumentative texts and has been restricted to specific domains or topics, limiting generalizability. We work on five datasets from diverse domains encompassing a range of controversial topics and present an approach for expanding the Bias-Corrected NRC Emotion Lexicon using DistilBERT embeddings, which we feed into a Neural Argumentative Stance Classification model. Our method systematically expands the emotion lexicon through contextualized embeddings to identify emotionally charged terms not previously captured in the lexicon. Our expanded NRC lexicon (eNRC) improves over the baseline across all five datasets (up to +6.2 percentage points in F1 score), outperforms the original NRC on four datasets (up to +3.0), and surpasses the LLM-based approach on nearly all corpora. We provide all resources (including eNRC, the adapted corpora, and the model architecture) to enable other researchers to build upon our work.
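One way embedding-based lexicon expansion can work is sketched below. This is not the authors' procedure, which the abstract does not detail. It assumes a nearest-seed rule with a cosine-similarity threshold, and it uses hypothetical three-dimensional toy vectors in place of DistilBERT embeddings. The names `expand_lexicon`, `toy_embeddings`, and the threshold of 0.9 are all illustrative.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def expand_lexicon(seed_lexicon, embeddings, threshold=0.9):
    """Give each out-of-lexicon word the emotion of its most similar
    seed word, provided the similarity reaches the threshold."""
    expanded = dict(seed_lexicon)
    for word, vec in embeddings.items():
        if word in seed_lexicon:
            continue
        best_seed, best_sim = None, threshold
        for seed in seed_lexicon:
            if seed not in embeddings:
                continue
            sim = cosine(vec, embeddings[seed])
            if sim >= best_sim:
                best_seed, best_sim = seed, sim
        if best_seed is not None:
            expanded[word] = seed_lexicon[best_seed]
    return expanded

# Hypothetical toy vectors; a real system would use model embeddings.
toy_embeddings = {
    "furious": [0.9, 0.1, 0.0],
    "outraged": [0.85, 0.15, 0.05],
    "delighted": [0.1, 0.9, 0.0],
    "thrilled": [0.15, 0.85, 0.1],
    "table": [0.0, 0.1, 0.9],
}
seed = {"furious": "anger", "delighted": "joy"}
expanded = expand_lexicon(seed, toy_embeddings)
print(expanded)
```

The threshold controls the trade-off between coverage and noise: a lower value adds more terms to the lexicon but risks attaching emotion labels to neutral words such as "table".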
Job applicants are increasingly turning to generative AI to create or enhance their resumes, leading to challenges in fairness, integrity, and efficiency of modern recruitment processes. We present the first curated corpus of resumes annotated as to whether they are authentic, AI-enhanced, or fully AI-generated. The corpus is balanced across the three classes, comprising 420 resumes spanning five job descriptions in the Information Technology (IT) sector, with the authentic resumes anonymized. We establish strong baselines for this task using traditional and neural supervised machine learning approaches, including Logistic Regression, SVM, Random Forest, XGBoost, BERT, and Longformer. For the featurized approaches, we pair sparse TF-IDF (word/character n-grams) with style features capturing length, punctuation, casing, contractions, lexical diversity (type-token ratio [TTR], number of hapax legomena), n-gram uniqueness, readability indices, and sentiment. Our analysis reveals systematic differences between the classes: AI-generated text features shorter, more uniform sentences, and fewer contractions; AI-enhanced text has the highest uniqueness and TTR; and authentic text has the widest variance across all features. XGBoost is the best performing method, achieving 95.29% accuracy and an F1 of 0.953. We make the corpus available for other researchers to build upon our work. We also benchmark two leading off-the-shelf AI–text detectors on our 420-resume corpus. Despite strong reports in other domains, Originality attains only 55.7% accuracy overall (71/140 authentic, 81/140 AI-generated, 82/140 AI-enhanced correct), and Writer attains 25.0%, with the largest failures on AI-enhanced resumes, highlighting domain shift and cautioning against uncalibrated deployment.
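Two of the lexical-diversity style features named above, type-token ratio and the number of hapax legomena, can be computed as in the following minimal sketch. The tokenizer (lowercased alphanumeric runs with apostrophes) and the sample sentence are assumptions for illustration; the abstract does not specify how the corpus was tokenized.

```python
import re
from collections import Counter

def lexical_diversity(text):
    """Token count, type count, type-token ratio, and hapax legomena count."""
    # Assumed tokenizer: lowercase runs of letters, digits, and apostrophes.
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    counts = Counter(tokens)
    n_tokens = len(tokens)
    n_types = len(counts)
    # Hapax legomena: word types that occur exactly once.
    hapax = sum(1 for c in counts.values() if c == 1)
    return {
        "tokens": n_tokens,
        "types": n_types,
        "ttr": n_types / n_tokens if n_tokens else 0.0,
        "hapax": hapax,
    }

# Hypothetical resume-style sentence, not taken from the corpus.
sample = "Built data pipelines and built dashboards; maintained data quality checks."
features = lexical_diversity(sample)
print(features)
```

Raw TTR falls as texts get longer, so comparing resumes of very different lengths would likely need a length-normalized variant or a fixed-size window.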