Junhao Lu
2025
Handling Missing Entities in Zero-Shot Named Entity Recognition: Integrated Recall and Retrieval Augmentation
Ruichu Cai
|
Junhao Lu
|
Zhongjie Chen
|
Boyan Xu
|
Zhifeng Hao
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Zero-shot Named Entity Recognition (ZS-NER) aims to recognize entities in unseen domains without specific annotated data. A key challenge is handling missing entities while ensuring accurate type recognition, hindered by: 1) the pre-training assumption that each entity has a single type, overlooking diversity, and 2) insufficient contextual knowledge for type reasoning. To address this, we propose IRRA (Integrated Recall and Retrieval Augmentation), a novel two-stage framework leveraging large language model techniques. In the Recall Augmented Entity Extracting stage, we built a perturbed dataset to induce the model to exhibit missing or erroneous extracted entities. Based on this, we trained an enhanced model to correct these errors. This approach can improve the ZS-NER’s recall rate. In the Retrieval Augmented Type Correcting stage, we employ Retrieval-Augmented Generation techniques to locate entity-related unannotated contexts, with the additional contextual information significantly improving the accuracy of type correcting. Extensive evaluations demonstrate the state-of-the-art performance of our IRRA, with significant improvements in zero-shot cross-domain settings validated through both auto-evaluated metrics and analysis. Our implementation will be open-sourced athttps://github.com/DMIRLAB-Group/IRRA.