Artificial Relationships in Fiction: A Dataset for Advancing NLP in Literary Domains

Despina Christou, Grigorios Tsoumakas


Abstract
Relation extraction (RE) in fiction presents unique NLP challenges due to implicit, narrative-driven relationships. Unlike factual texts, fiction weaves complex connections, yet existing RE datasets focus on non-fiction. To address this gap, we introduce Artificial Relationships in Fiction (ARF), a synthetically annotated dataset for literary RE. Built from diverse Project Gutenberg fiction, ARF considers author demographics, publication periods, and themes. We curated an ontology for fiction-specific entities and relations and used GPT-4o to generate artificial relationships that capture narrative complexity. Our analysis demonstrates ARF's value for finetuning RE models and advancing computational literary studies. By bridging a critical RE gap, ARF enables deeper exploration of fictional relationships, enriching NLP research at the intersection of storytelling and AI-driven literary analysis.
Anthology ID:
2025.latechclfl-1.13
Volume:
Proceedings of the 9th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature (LaTeCH-CLfL 2025)
Month:
May
Year:
2025
Address:
Albuquerque, New Mexico
Editors:
Anna Kazantseva, Stan Szpakowicz, Stefania Degaetano-Ortlieb, Yuri Bizzoni, Janis Pagel
Venues:
LaTeCHCLfL | WS
Publisher:
Association for Computational Linguistics
Pages:
130–147
URL:
https://preview.aclanthology.org/Ingest-2025-COMPUTEL/2025.latechclfl-1.13/
Cite (ACL):
Despina Christou and Grigorios Tsoumakas. 2025. Artificial Relationships in Fiction: A Dataset for Advancing NLP in Literary Domains. In Proceedings of the 9th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature (LaTeCH-CLfL 2025), pages 130–147, Albuquerque, New Mexico. Association for Computational Linguistics.
Cite (Informal):
Artificial Relationships in Fiction: A Dataset for Advancing NLP in Literary Domains (Christou & Tsoumakas, LaTeCHCLfL 2025)
PDF:
https://preview.aclanthology.org/Ingest-2025-COMPUTEL/2025.latechclfl-1.13.pdf