All Entities are Not Created Equal: Examining the Long Tail for Ultra-Fine Entity Typing
Advait Deshmukh, Ashwin Umadi, Dananjay Srinivas, Maria Leonor Pacheco
Abstract
Due to their capacity to acquire world knowledge from large corpora, pre-trained language models (PLMs) are extensively used in ultra-fine entity typing tasks where the space of labels is extremely large. In this work, we explore the limitations of the knowledge acquired by PLMs by proposing a novel heuristic to approximate the pre-training distribution of entities when the pre-training data is unknown. Then, we systematically demonstrate that entity-typing approaches that rely solely on the parametric knowledge of PLMs struggle significantly with entities at the long tail of the pre-training distribution, and that knowledge-infused approaches can account for some of these shortcomings. Our findings suggest that we need to go beyond PLMs to produce solutions that perform well for infrequent entities.
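The abstract names a heuristic for approximating the pre-training distribution of entities but does not spell it out here. As a minimal illustrative sketch only, and not the heuristic proposed in the paper, one could proxy an entity's pre-training frequency by counting its mentions in a stand-in corpus and ranking entities to separate the head from the long tail. The corpus file, entity list, and tail cutoff below are all hypothetical choices.

```python
# Illustrative sketch (assumed setup, not the paper's method): estimate how
# often each entity mention appears in a stand-in corpus, then rank entities
# and treat the low-frequency remainder as the "long tail".
from collections import Counter
import re


def entity_frequencies(entities, corpus_text):
    """Count case-insensitive whole-phrase occurrences of each entity string."""
    counts = Counter()
    lowered = corpus_text.lower()
    for entity in entities:
        # Escape the mention so punctuation inside names is matched literally.
        pattern = r"\b" + re.escape(entity.lower()) + r"\b"
        counts[entity] = len(re.findall(pattern, lowered))
    return counts


def split_head_tail(counts, tail_quantile=0.8):
    """Entities ranked below the given quantile form the long tail (cutoff is arbitrary)."""
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    cutoff = int(len(ranked) * tail_quantile)
    return ranked[:cutoff], ranked[cutoff:]


if __name__ == "__main__":
    # "proxy_corpus.txt" is a hypothetical stand-in for the unknown pre-training data.
    corpus = open("proxy_corpus.txt", encoding="utf-8").read()
    entities = ["Barack Obama", "Suzhou", "ultrafine entity typing"]
    head, tail = split_head_tail(entity_frequencies(entities, corpus))
    print("head:", head)
    print("long tail:", tail)
```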
- Anthology ID: 2025.starsem-1.15
- Volume: Proceedings of the 14th Joint Conference on Lexical and Computational Semantics (*SEM 2025)
- Month: November
- Year: 2025
- Address: Suzhou, China
- Editors: Lea Frermann, Mark Stevenson
- Venue: *SEM
- Publisher: Association for Computational Linguistics
- Pages: 189–201
- URL: https://preview.aclanthology.org/ingest-emnlp/2025.starsem-1.15/
- Cite (ACL): Advait Deshmukh, Ashwin Umadi, Dananjay Srinivas, and Maria Leonor Pacheco. 2025. All Entities are Not Created Equal: Examining the Long Tail for Ultra-Fine Entity Typing. In Proceedings of the 14th Joint Conference on Lexical and Computational Semantics (*SEM 2025), pages 189–201, Suzhou, China. Association for Computational Linguistics.
- Cite (Informal): All Entities are Not Created Equal: Examining the Long Tail for Ultra-Fine Entity Typing (Deshmukh et al., *SEM 2025)
- PDF: https://preview.aclanthology.org/ingest-emnlp/2025.starsem-1.15.pdf