Despina Christou
2025
Artificial Relationships in Fiction: A Dataset for Advancing NLP in Literary Domains
Despina Christou | Grigorios Tsoumakas
Proceedings of the 9th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature (LaTeCH-CLfL 2025)
Relation extraction (RE) in fiction presents unique NLP challenges due to implicit, narrative-driven relationships. Unlike factual texts, fiction weaves complex connections, yet existing RE datasets focus on non-fiction. To address this, we introduce Artificial Relationships in Fiction (ARF), a synthetically annotated dataset for literary RE. Built from diverse Project Gutenberg fiction, ARF considers author demographics, publication periods, and themes. We curated an ontology for fiction-specific entities and relations and, using GPT-4o, generated artificial relationships to capture narrative complexity. Our analysis demonstrates ARF's value for fine-tuning RE models and advancing computational literary studies. By bridging a critical RE gap, ARF enables deeper exploration of fictional relationships, enriching NLP research at the intersection of storytelling and AI-driven literary analysis.
2021
The concept of nation in nineteenth-century Greek fiction through computational literary analysis
Fotini Koidaki | Despina Christou | Katerina Tiktopoulou | Grigorios Tsoumakas
Proceedings of the Workshop on Natural Language Processing for Digital Humanities
How can the construction of national consciousness be captured in the literary production of a whole century? What can a macro-analysis of 19th-century prose fiction reveal about the formation of the concept of the Greek nation-state? How can the concept of nationality be detected in literary writing and then interpreted? This paper addresses these questions by exploring how the concept of the nation is figured and shaped in 19th-century Greek prose fiction. It proposes a methodological approach that combines well-known text mining techniques with computational close reading methods to retrieve nation-related passages and analyze them linguistically and semantically. The main objective is to map the frequency and phraseology of nation-related references and to explore the phrase patterns in relation to the topic modeling results.