Speaking on Their Behalf: Detecting Indirect Speech in Historical Danish and Norwegian Texts

Ali Al-Laith, Alexander Conroy, Kirstine Degn, Jens Bjerring-Hansen, Daniel Hershcovich


Abstract
Indirect speech is a fundamental yet understudied form of reported speech that plays a crucial role in literary texts and communication. While direct speech detection has received significant attention in computational linguistics, the automatic identification of indirect speech remains a challenge due to its nuanced linguistic structure and contextual dependencies. This paper focuses on the detection of indirect speech in late 19th-century Scandinavian literature, where its presence has been linked to shifting aesthetic ideals. We present an annotated dataset of 150 segments, one randomly selected from each of 150 different novels, designed to capture indirect speech in Danish and Norwegian literature. We evaluate four pre-trained language models for classifying indirect speech; a Danish Foundation Model (DFM Large), trained on extensive Danish data, achieves the highest performance. Finally, we conduct a classifier-assisted quantitative corpus analysis and find that the prevalence of indirect speech fluctuates over time.
Anthology ID:
2026.latechclfl-1.15
Volume:
Proceedings of the 10th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature 2026
Month:
March
Year:
2026
Address:
Rabat, Morocco
Editors:
Diego Alves, Yuri Bizzoni, Stefania Degaetano-Ortlieb, Anna Kazantseva, Janis Pagel, Stan Szpakowicz
Venues:
LaTeCH-CLfL | WS
SIG:
SIGHUM
Publisher:
Association for Computational Linguistics
Pages:
157–163
URL:
https://preview.aclanthology.org/ingest-eacl/2026.latechclfl-1.15/
Cite (ACL):
Ali Al-Laith, Alexander Conroy, Kirstine Degn, Jens Bjerring-Hansen, and Daniel Hershcovich. 2026. Speaking on Their Behalf: Detecting Indirect Speech in Historical Danish and Norwegian Texts. In Proceedings of the 10th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature 2026, pages 157–163, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal):
Speaking on Their Behalf: Detecting Indirect Speech in Historical Danish and Norwegian Texts (Al-Laith et al., LaTeCH-CLfL 2026)
PDF:
https://preview.aclanthology.org/ingest-eacl/2026.latechclfl-1.15.pdf
Supplementary material:
 2026.latechclfl-1.15.SupplementaryMaterial.zip
 2026.latechclfl-1.15.SupplementaryMaterial.txt