An Enhanced Training-Free Pipeline for Entity Recognition and Linking: A Low-Resource Case Study – 20-th Century Historical Medical Texts

Phu-Vinh Nguyen, Vera Danilova


Abstract
Entity linking in biomedicine typically relies on large annotated corpora and supervised methods, which often fail in out-of-distribution settings. Historical medical texts are rich in biomedical terms but pose unique challenges: terminology has changed, some concepts are obsolete, and stylistic differences from modern journals prevent off-the-shelf models fine-tuned on contemporary datasets from aligning historical terms with current ontologies. Training-free methods based on LLMs offer a solution by linking historical terms to modern concepts and inferring their meaning from context. In this paper, we evaluate a state-of-the-art training-free entity linking method on historical medical texts and propose an improved pipeline that performs end-to-end entity extraction and linking with confidence estimation. We also assess the pipeline on modern benchmarks to check whether the gains generalize to other domains, and find that it performs better in most cases. We report an analysis of the findings. The code and curated dataset for historical medical entity linking are available on GitHub.
Anthology ID:
2026.healing-1.8
Volume:
Proceedings of the 1st Workshop on Linguistic Analysis for Health (HeaLing 2026)
Month:
March
Year:
2026
Address:
Rabat, Morocco
Editors:
Vera Danilova, Murathan Kurfalı, Ylva Söderfeldt, Julia Reed, Andrew Burchell
Venues:
HeaLing | WS
Publisher:
Association for Computational Linguistics
Pages:
94–104
URL:
https://preview.aclanthology.org/ingest-eacl/2026.healing-1.8/
Cite (ACL):
Phu-Vinh Nguyen and Vera Danilova. 2026. An Enhanced Training-Free Pipeline for Entity Recognition and Linking: A Low-Resource Case Study – 20-th Century Historical Medical Texts. In Proceedings of the 1st Workshop on Linguistic Analysis for Health (HeaLing 2026), pages 94–104, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal):
An Enhanced Training-Free Pipeline for Entity Recognition and Linking: A Low-Resource Case Study – 20-th Century Historical Medical Texts (Nguyen & Danilova, HeaLing 2026)
PDF:
https://preview.aclanthology.org/ingest-eacl/2026.healing-1.8.pdf