@inproceedings{ye-mitchell-2026-llms,
title = "Where do {LLM}s currently stand on biomedical {NER} in both clean and noisy settings ?",
author = "Ye, Christophe and
Mitchell, Cassie S.",
editor = "Demberg, Vera and
Inui, Kentaro and
Marquez, Llu{\'i}s",
booktitle = "Findings of the {A}ssociation for {C}omputational {L}inguistics: {EACL} 2026",
month = mar,
year = "2026",
address = "Rabat, Morocco",
publisher = "Association for Computational Linguistics",
url = "https://preview.aclanthology.org/ingest-eacl/2026.findings-eacl.51/",
pages = "977--1001",
ISBN = "979-8-89176-386-9",
abstract = "Biomedical Named Entity Recognition (NER) consists of identifying and classifying important biomedical entities mentioned in text. Traditionally, biomedical NER has heavily relied on domain-specific pre-trained language models; particularly variant of BERT models. With the emergence of large language models (LLMs), some studies have evaluated their performance on biomedical NLP tasks. These studies consistently show that, despite their general capabilities, LLMs still fall short compared to specialized BERT-based models for biomedical NER. However, as LLMs continue to advance at a remarkable pace, natural questions arise: Are they still far behind, or are they starting to be competitive? In this study, we investigate the performance of recent LLMs across multiple biomedical NER datasets under both clean and noisy dataset conditions. Our findings reveal that LLMs are progressively closing the performance gap with BERT-based models and demonstrate particular strengths in low-data settings. Moreover, our results suggest that in-context learning with LLMs exhibits a notable degree of robustness to noise, making them a promising alternative in settings where labeled data is scarce or noisy."
}
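For readers who want a concrete picture of the in-context learning setup the abstract refers to, the following is a minimal Python sketch of few-shot prompting for biomedical NER. The demonstrations, entity types, output format, and the stubbed model reply are illustrative assumptions only, not the paper's actual prompts, tag set, or evaluation code.

# Minimal sketch of few-shot in-context learning for biomedical NER,
# in the spirit of the setup described in the abstract. Demonstrations,
# entity types, and the stubbed reply below are illustrative assumptions,
# not the paper's actual prompts or datasets.

# A few labeled demonstrations included verbatim in the prompt.
DEMONSTRATIONS = [
    ("Mutations in BRCA1 increase breast cancer risk.",
     [("BRCA1", "Gene"), ("breast cancer", "Disease")]),
    ("Aspirin reduced inflammation in treated patients.",
     [("Aspirin", "Chemical"), ("inflammation", "Disease")]),
]

def build_prompt(sentence: str) -> str:
    """Assemble a few-shot NER prompt: task description, demos, query."""
    lines = ["Extract biomedical entities as 'mention | type' pairs."]
    for text, entities in DEMONSTRATIONS:
        lines.append(f"Sentence: {text}")
        lines.extend(f"{mention} | {etype}" for mention, etype in entities)
    lines.append(f"Sentence: {sentence}")
    return "\n".join(lines)

def parse_response(response: str) -> list[tuple[str, str]]:
    """Read 'mention | type' lines back into (mention, type) tuples."""
    pairs = []
    for line in response.strip().splitlines():
        if "|" in line:
            mention, _, etype = line.partition("|")
            pairs.append((mention.strip(), etype.strip()))
    return pairs

if __name__ == "__main__":
    prompt = build_prompt("Metformin is prescribed for type 2 diabetes.")
    # In practice `prompt` would be sent to an LLM client of choice;
    # here a hard-coded reply stands in to show the expected format.
    fake_reply = "Metformin | Chemical\ntype 2 diabetes | Disease"
    print(parse_response(fake_reply))

Because the labeled examples live entirely in the prompt, this setup needs no gradient updates, which is consistent with the abstract's point about low-data settings.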