Michaela Regneri

2025

Automating Violence Detection and Categorization from Ancient Texts
Alhassan Abdelhalim | Michaela Regneri
Proceedings of the 9th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature (LaTeCH-CLfL 2025)

Violence descriptions in literature offer valuable insights for a wide range of research in the humanities. For historians, depictions of violence are of special interest for analyzing the societal dynamics surrounding large wars and individual conflicts of influential people. Harvesting data for violence research manually is laborious and time-consuming. This study is the first to evaluate the effectiveness of large language models (LLMs) in identifying violence in ancient texts and categorizing it across multiple dimensions. Our experiments identify LLMs as a valuable tool to scale up the accurate analysis of historical texts and show the effect of fine-tuning and data augmentation, yielding an F1-score of up to 0.93 for violence detection and 0.86 for fine-grained violence categorization.

2024

Detecting Conceptual Abstraction in LLMs
Michaela Regneri | Alhassan Abdelhalim | Soeren Laue
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

We show a novel approach to detecting noun abstraction within a large language model (LLM). Starting from a psychologically motivated set of noun pairs in taxonomic relationships, we instantiate surface patterns indicating hypernymy and analyze the attention matrices produced by BERT. We compare the results to two sets of counterfactuals and show that we can detect hypernymy in the abstraction mechanism, which cannot solely be related to the distributional similarity of noun pairs. Our findings are a first step towards the explainability of conceptual abstraction in LLMs.

2020

Images and Imagination: Automated Analysis of Priming Effects Related to Autism Spectrum Disorder and Developmental Language Disorder
Michaela Regneri | Diane King | Fahreen Walji | Olympia Palikara
Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics

Different aspects of language processing have been shown to be sensitive to priming, but the findings of studies examining priming effects in adolescents with Autism Spectrum Disorder (ASD) and Developmental Language Disorder (DLD) have been inconclusive. We present a study analysing visual and implicit semantic priming in adolescents with ASD and DLD. Based on a dataset of fictional and script-like narratives, we evaluate how often and how extensively content from two different priming sources is used by the participants. The first priming source was visual, consisting of images shown to the participants to assist them with their storytelling. The second priming source originated from commonsense knowledge, using crowdsourced data containing prototypical script elements. Our results show that individuals with ASD are less sensitive to both types of priming, but show typical usage of primed cues when they use them at all. In contrast, children with DLD show mostly average priming sensitivity, but exhibit an over-proportional use of the priming cues.

2016

Automated Discourse Analysis of Narrations by Adolescents with Autistic Spectrum Disorder
Michaela Regneri | Diane King
Proceedings of the 7th Workshop on Cognitive Aspects of Computational Language Learning

2014

Aligning Predicate-Argument Structures for Paraphrase Fragment Extraction
Michaela Regneri | Rui Wang | Manfred Pinkal
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

Paraphrases and paraphrasing algorithms have been found to be of great importance in various natural language processing tasks. While most paraphrase extraction approaches extract equivalent sentences, sentences are an inconvenient unit for further processing because they are too specific and often not exact paraphrases. Paraphrase fragment extraction is a technique that post-processes sentential paraphrases and prunes them to more convenient phrase-level units. We present a new approach that uses semantic roles to extract paraphrase fragments from sentence pairs that share semantic content to varying degrees, including full paraphrases. In contrast to previous systems, the use of semantic parses allows for extracting paraphrases with high wording variance and different syntactic categories. The approach is tested on four different input corpora and compared to two previous systems for extracting paraphrase fragments. Our system finds three times as many good paraphrase fragments per sentence pair as the baselines, and at the same time outputs 30% fewer unrelated fragment pairs.

lex4all: A language-independent tool for building and evaluating pronunciation lexicons for small-vocabulary speech recognition
Anjana Vakil | Max Paulus | Alexis Palmer | Michaela Regneri
Proceedings of 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations

SeedLing: Building and Using a Seed corpus for the Human Language Project
Guy Emerson | Liling Tan | Susanne Fertmann | Alexis Palmer | Michaela Regneri
Proceedings of the 2014 Workshop on the Use of Computational Methods in the Study of Endangered Languages

Short-Term Projects, Long-Term Benefits: Four Student NLP Projects for Low-Resource Languages
Alexis Palmer | Michaela Regneri
Proceedings of the 2014 Workshop on the Use of Computational Methods in the Study of Endangered Languages

2013

Recent work has shown that the integration of visual information into text-based models can substantially improve model predictions, but so far only visual information extracted from static images has been used. In this paper, we consider the problem of grounding sentences describing actions in visual information extracted from videos. We present a general-purpose corpus that aligns high-quality videos with multiple natural language descriptions of the actions portrayed in the videos, together with an annotation of how similar the action descriptions are to each other. Experimental results demonstrate that a text-based model of similarity between actions improves substantially when combined with visual information from videos depicting the described actions.