A Herd of Language Models Makes a Better Zero-shot Annotator for Clinical Named Entity Recognition

Seiji Shimizu, Shoko Wakamiya, Eiji Aramaki


Abstract
Clinical named entity recognition (NER) remains difficult to scale because manual annotation is costly. Although large language models (LLMs) enable zero-shot annotation, their performance on clinical NER is still limited. To close this gap, we improve annotation quality by aggregating annotations from *a herd of diverse LLMs*, including general-purpose, medically adapted, and NER-specialized models. A key challenge in this multi-LLM setting is making effective use of entities extracted by only a minority of models: although they account for a substantial portion of true positives, they are heavily intermixed with noise. To address this challenge, we introduce **MARY**, a label-modeling method for **M**ulti-LLM **A**nnotation using **R**epresentation learning to capture contextual similarit**Y**. During aggregation, MARY selectively incorporates minority-extracted entities whose contexts are similar to those of majority-extracted entities, yielding more reliable and comprehensive annotations. Experimental results show that MARY improves the average F1 score by 8.6% over vanilla zero-shot baselines while reducing annotation costs.
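The aggregation idea in the abstract can be illustrated with a toy sketch. This is not the authors' MARY implementation: the abstract does not specify the representation learner, the similarity measure, or the decision rule, so the cosine similarity, the `sim_threshold` value, the strict-majority vote, the function names, and the hand-made context vectors below are all assumptions made for illustration. Entities are plain strings here; a real pipeline would more likely key them by document, span, and type.

```python
from collections import Counter
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def aggregate(annotations, embeddings, sim_threshold=0.8):
    """Hypothetical aggregation rule, assumed for illustration only.

    annotations: list of sets, one set of extracted entities per model.
    embeddings:  dict mapping each entity to a context vector (assumed given).
    Returns (majority, accepted), where accepted is the majority set plus any
    minority entity whose context is similar to at least one majority entity.
    """
    n_models = len(annotations)
    votes = Counter(entity for ann in annotations for entity in ann)

    # Entities extracted by more than half of the models.
    majority = {e for e, count in votes.items() if count > n_models / 2}
    minority = set(votes) - majority

    accepted = set(majority)
    for e in minority:
        # Keep a minority entity only if its context resembles a majority one.
        if any(cosine(embeddings[e], embeddings[m]) >= sim_threshold for m in majority):
            accepted.add(e)
    return majority, accepted


# Toy data: three models, hand-made 2-d "context" vectors (not real embeddings).
annotations = [
    {"aspirin", "fever"},
    {"aspirin", "fever", "ibuprofen"},
    {"aspirin", "monday"},
]
embeddings = {
    "aspirin": [1.0, 0.0],
    "fever": [0.0, 1.0],
    "ibuprofen": [0.9, 0.1],
    "monday": [-1.0, -1.0],
}

majority, accepted = aggregate(annotations, embeddings)
print(sorted(majority))
print(sorted(accepted))
```

In this toy run, "ibuprofen" is extracted by only one model but sits close to the majority entity "aspirin" in the assumed context space, so it is kept, while "monday" is dropped as likely noise. Plain majority voting would have discarded both.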
Anthology ID:
2026.findings-acl.599
Volume:
Findings of the Association for Computational Linguistics: ACL 2026
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
12327–12344
URL:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.599/
Cite (ACL):
Seiji Shimizu, Shoko Wakamiya, and Eiji Aramaki. 2026. A Herd of Language Models Makes a Better Zero-shot Annotator for Clinical Named Entity Recognition. In Findings of the Association for Computational Linguistics: ACL 2026, pages 12327–12344, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
A Herd of Language Models Makes a Better Zero-shot Annotator for Clinical Named Entity Recognition (Shimizu et al., Findings 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.599.pdf
Checklist:
 2026.findings-acl.599.checklist.pdf