Detecting reported speech as a token classification task: an application to Classical Latin?

Agustin Dei

Abstract

This paper presents the first application of an automatic token-classification approach for detecting reported speech spans in Classical Latin using transformer-based neural architectures. Focusing on Seneca the Elder’s Declamatory Anthology, the study addresses the text’s highly polyphonic nature, which results from the use of reported speech. Instead of relying exclusively on sentence-level syntactic information, the proposed approach treats reported speech detection as a token-level sequence labeling problem. This enables the identification of reported speech spans extending across multiple sentences. We fine-tune three Latin neural language models (LatinBERT, LaBERTa, and PhilBERTa) for binary token-level classification and conduct experiments both with and without punctuation. The results show that RoBERTa-based models effectively identify reported speech, with LaBERTa achieving the best performance (F1 scores above 0.90).
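
As an illustration of the token-level framing described in the abstract, the following minimal sketch shows how binary per-token labels can be grouped into reported speech spans that cross sentence boundaries, and how a token-level F1 score can be computed. It is not the author's code: the function names `labels_to_spans` and `token_f1`, the label convention (1 = inside reported speech, 0 = outside), and the short Latin token sequence are all assumptions made for the example, and the tokens are not taken from Seneca the Elder's text.

```python
from typing import List, Tuple


def labels_to_spans(labels: List[int]) -> List[Tuple[int, int]]:
    """Group consecutive 1-labels into (start, end) spans, end exclusive."""
    spans: List[Tuple[int, int]] = []
    start = None
    for i, label in enumerate(labels):
        if label == 1 and start is None:
            start = i                      # a span opens here
        elif label != 1 and start is not None:
            spans.append((start, i))       # the span closed on the previous token
            start = None
    if start is not None:                  # span runs to the end of the sequence
        spans.append((start, len(labels)))
    return spans


def token_f1(gold: List[int], pred: List[int]) -> float:
    """Token-level F1 for the positive class (label 1)."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example (invented tokens, not from the corpus).
tokens = ["dixit", ":", "veni", ".", "vidi", ".", "tacuit", "."]
gold = [0, 0, 1, 1, 1, 1, 0, 0]   # the span covers two sentences: "veni . vidi ."
pred = [0, 0, 1, 1, 1, 0, 0, 0]   # an imagined model output missing one token

gold_spans = labels_to_spans(gold)   # [(2, 6)]
score = token_f1(gold, pred)         # 6/7, roughly 0.857
```

Because the labels attach to tokens rather than sentences, the gold span above runs straight through the sentence-final period at index 3, which is the property the abstract highlights.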

Anthology ID: 2026.latechclfl-1.24
Volume: Proceedings of the 10th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature 2026
Month: March
Year: 2026
Address: Rabat, Morocco
Editors: Diego Alves, Yuri Bizzoni, Stefania Degaetano-Ortlieb, Anna Kazantseva, Janis Pagel, Stan Szpakowicz
Venues: LaTeCH-CLfL | WS
SIG: SIGHUM
Publisher: Association for Computational Linguistics
Pages: 251–256
URL: https://preview.aclanthology.org/ingest-eacl/2026.latechclfl-1.24/
Cite (ACL): Agustin Dei. 2026. Detecting reported speech as a token classification task: an application to Classical Latin?. In Proceedings of the 10th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature 2026, pages 251–256, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal): Detecting reported speech as a token classification task: an application to Classical Latin? (Dei, LaTeCH-CLfL 2026)
PDF: https://preview.aclanthology.org/ingest-eacl/2026.latechclfl-1.24.pdf
Supplementary material: 2026.latechclfl-1.24.SupplementaryMaterial.txt
