Svenja Guhr
2026
Between Whispers and Screams: Loudness Standard Deviation as a Proxy for Explicit Content Detection in US Romance Novels
Svenja Guhr
Proceedings of the 6th International Conference on Natural Language Processing for the Digital Humanities
This study proposes and tests loudness standard deviation (SD) of fictional sound events as an acoustically grounded proxy for detecting explicit content in romance fiction. Working with a subcorpus of novels from the Harlequin Men Made in America series, scenes were annotated for character and ambient sound with loudness levels. The scenes were additionally annotated on a ternary severity scale for two content advisory categories drawn from the PG-story taxonomy, Sex & Nudity and Violence & Scariness (CITATION), and within-scene loudness SD of character and ambient sound was tested for correlation with either category. The analyses reveal that erotic scenes are acoustically marked by significantly higher variability in character-produced sounds, reflecting the dynamic range from whispered dialogue to vocalized arousal, while no significant correlation was found between high ambient sound loudness SD and scenes of elevated Violence & Scariness.
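The measure described in this abstract can be illustrated with a minimal sketch. It is not the study's actual pipeline: the scene records, the integer loudness levels, the 0 to 2 severity coding, the use of population SD, and the choice of Spearman rank correlation are all assumptions made for illustration.

```python
from statistics import mean, pstdev

# Hypothetical annotations (not data from the study): each scene lists the
# loudness levels of its character sounds and a ternary severity rating.
scenes = [
    {"id": "s1", "character_loudness": [1, 1, 2], "sex_nudity": 0},
    {"id": "s2", "character_loudness": [1, 3, 5, 2], "sex_nudity": 2},
    {"id": "s3", "character_loudness": [2, 2, 3], "sex_nudity": 1},
    {"id": "s4", "character_loudness": [3, 3, 3], "sex_nudity": 0},
]


def loudness_sd(levels):
    """Within-scene loudness SD (population SD; 0.0 for fewer than two events)."""
    return pstdev(levels) if len(levels) > 1 else 0.0


def ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


sds = [loudness_sd(s["character_loudness"]) for s in scenes]
severity = [s["sex_nudity"] for s in scenes]
rho = spearman(sds, severity)
print(f"Spearman rho between loudness SD and severity: {rho:.3f}")
```

A rank-based coefficient is used here because the severity scale is ordinal. A real analysis would also need a significance test and far more scenes than this toy sample.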
2025
Rethinking Scene Segmentation. Advancing Automated Detection of Scene Changes in Literary Texts
Svenja Guhr | Huijun Mao | Fengyi Lin
Proceedings of the 9th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature (LaTeCH-CLfL 2025)
Automated scene segmentation is an ongoing challenge in computational literary studies (CLS), where it would allow literary texts to be analyzed in comparable units. In this paper, we present our approach (work in progress) to text segmentation using a classifier that identifies the position of a scene change in English-language fiction. By manually annotating novels from a 20th-century US-English romance fiction corpus, we prepared training data for fine-tuning transformer models, yielding promising preliminary results for improving automated text segmentation in CLS.
2022
Exploring Text Recombination for Automatic Narrative Level Detection
Nils Reiter | Judith Sieker | Svenja Guhr | Evelyn Gius | Sina Zarrieß
Proceedings of the Thirteenth Language Resources and Evaluation Conference
Automating the process of understanding the global narrative structure of long texts and stories is still a major challenge for state-of-the-art natural language understanding systems, particularly because annotated data is scarce and existing annotation workflows do not scale well to the annotation of complex narrative phenomena. In this work, we focus on the identification of narrative levels in texts corresponding to stories that are embedded in stories. Lacking sufficient pre-annotated training data, we explore a solution to deal with data scarcity that is common in machine learning: the automatic augmentation of an existing small data set of annotated samples with the help of data synthesis. We present a workflow for narrative level detection that includes the operationalization of the task, a model, and a data augmentation protocol for automatically generating narrative texts annotated with breaks between narrative levels. Our experiments suggest that narrative levels in long texts constitute a challenging phenomenon for state-of-the-art NLP models, but generating training data synthetically does improve the prediction results considerably.
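The idea of synthesizing texts annotated with narrative level breaks can be sketched as follows. This is an illustrative guess at one simple form of text recombination, not the paper's augmentation protocol. The function name `recombine`, the sentence-list representation, and the choice to embed one story at a random inner sentence boundary of another are all assumptions.

```python
import random


def recombine(frame, embedded, rng):
    """Embed one story inside another and record the narrative level breaks.

    `frame` and `embedded` are lists of sentences (hypothetical inputs).
    Returns the combined sentence list and two break indices: where the
    embedded story starts and where the frame story resumes.
    """
    if len(frame) < 2:
        raise ValueError("frame story needs at least two sentences")
    # Pick an inner boundary so the frame story both opens and closes the text.
    cut = rng.randint(1, len(frame) - 1)
    sentences = frame[:cut] + embedded + frame[cut:]
    breaks = [cut, cut + len(embedded)]
    return sentences, breaks


# Toy sentences invented for illustration only.
frame_story = [
    "The travellers gathered by the fire.",
    "An old sailor cleared his throat.",
    "When he had finished, nobody spoke.",
]
embedded_story = [
    "Years ago, a ship left the harbour at dawn.",
    "It never reached the other shore.",
]

rng = random.Random(0)  # fixed seed so the synthetic sample is reproducible
sentences, breaks = recombine(frame_story, embedded_story, rng)
for index, sentence in enumerate(sentences):
    marker = "<BREAK> " if index in breaks else ""
    print(marker + sentence)
```

Each call yields one synthetic text with gold break positions, so repeating it over many story pairs would produce labeled training samples. Whether such recombined texts resemble naturally embedded narratives closely enough is exactly the kind of question the experiments in the paper address.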