Prompting the Past: Exploring Zero-Shot Learning for Named Entity Recognition in Historical Texts Using Prompt-Answering LLMs

Crina Tudor, Beata Megyesi, Robert Östling


Abstract
This paper investigates the application of prompt-answering Large Language Models (LLMs) for the task of Named Entity Recognition (NER) in historical texts. Historical NER presents unique challenges due to language change through time, spelling variation, limited availability of digitized data (and, in particular, labeled data), and errors introduced by Optical Character Recognition (OCR) and Handwritten Text Recognition (HTR) processes. Leveraging the zero-shot capabilities of prompt-answering LLMs, we address these challenges by prompting the model to extract entities such as persons, locations, organizations, and dates from historical documents. We then conduct an extensive error analysis of the model output in order to identify and address potential weaknesses in the entity recognition process. The results show that, while such models display ability for extracting named entities, their overall performance is lackluster. Our analysis reveals that model performance is significantly affected by hallucinations in the model output, as well as by challenges imposed by the evaluation of NER output.
Anthology ID:
2025.latechclfl-1.19
Volume:
Proceedings of the 9th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature (LaTeCH-CLfL 2025)
Month:
May
Year:
2025
Address:
Albuquerque, New Mexico
Editors:
Anna Kazantseva, Stan Szpakowicz, Stefania Degaetano-Ortlieb, Yuri Bizzoni, Janis Pagel
Venues:
LaTeCHCLfL | WS
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
216–226
Language:
URL:
https://preview.aclanthology.org/Ingest-2025-COMPUTEL/2025.latechclfl-1.19/
DOI:
Bibkey:
Cite (ACL):
Crina Tudor, Beata Megyesi, and Robert Östling. 2025. Prompting the Past: Exploring Zero-Shot Learning for Named Entity Recognition in Historical Texts Using Prompt-Answering LLMs. In Proceedings of the 9th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature (LaTeCH-CLfL 2025), pages 216–226, Albuquerque, New Mexico. Association for Computational Linguistics.
Cite (Informal):
Prompting the Past: Exploring Zero-Shot Learning for Named Entity Recognition in Historical Texts Using Prompt-Answering LLMs (Tudor et al., LaTeCHCLfL 2025)
Copy Citation:
PDF:
https://preview.aclanthology.org/Ingest-2025-COMPUTEL/2025.latechclfl-1.19.pdf