Agnus LLM: Robust and Flexible Entity Disambiguation with decoder-only Language Models
Kristian Noullet, Ayoub Ourgani, Niklas Thomas Lakner, Lukas Kinder, Tobias Käfer
Abstract
Entity disambiguation (ED) links ambiguous mentions in text to entries in a knowledge base and is a core task in entity linking systems. While pretrained decoder-only language models (DLMs) offer strong generalization capabilities, their effective use in ED has been limited by sensitivity to candidate order, susceptibility to hallucinated outputs, and potential dataset leakage. We introduce Agnus, a zero-shot ED framework that addresses these challenges through three core innovations: (1) order-invariant candidate encoding via shared positional embeddings and modified autoregressive attention masking, which eliminates bias from input ordering; (2) constrained decoding that restricts outputs to valid candidates, effectively preventing hallucinations; and (3) a synthetic dataset creation approach that serves as a diagnostic tool for detecting and mitigating data contamination. Agnus eliminates up to 15.2% of F1 variability caused by candidate permutations, delivering consistent and order-robust predictions previously unattainable with autoregressive architectures. In our experiments, Agnus achieves state-of-the-art performance on four standard ED benchmarks, surpassing prior zero-shot approaches by an average of 3.7% using small language models. We release code, data including candidate sets, and a synthetic benchmark to support reproducibility and controlled evaluation.
- Anthology ID:
- 2025.ijcnlp-long.174
- Volume:
- Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics
- Month:
- December
- Year:
- 2025
- Address:
- Mumbai, India
- Editors:
- Kentaro Inui, Sakriani Sakti, Haofen Wang, Derek F. Wong, Pushpak Bhattacharyya, Biplab Banerjee, Asif Ekbal, Tanmoy Chakraborty, Dhirendra Pratap Singh
- Venues:
- IJCNLP | AACL
- Publisher:
- The Asian Federation of Natural Language Processing and The Association for Computational Linguistics
- Pages:
- 3266–3284
- URL:
- https://preview.aclanthology.org/ingest-ijcnlp-aacl/2025.ijcnlp-long.174/
- Cite (ACL):
- Kristian Noullet, Ayoub Ourgani, Niklas Thomas Lakner, Lukas Kinder, and Tobias Käfer. 2025. Agnus LLM: Robust and Flexible Entity Disambiguation with decoder-only Language Models. In Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, pages 3266–3284, Mumbai, India. The Asian Federation of Natural Language Processing and The Association for Computational Linguistics.
- Cite (Informal):
- Agnus LLM: Robust and Flexible Entity Disambiguation with decoder-only Language Models (Noullet et al., IJCNLP-AACL 2025)
- PDF:
- https://preview.aclanthology.org/ingest-ijcnlp-aacl/2025.ijcnlp-long.174.pdf
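
The abstract's second point, constrained decoding that restricts outputs to valid candidates, can be illustrated with a generic prefix-trie sketch. This is not the Agnus implementation: the names `build_trie`, `constrained_greedy_decode`, `toy_scores`, and the `<eos>` marker are hypothetical, and the toy scoring function merely stands in for a language model's next-token scores.

```python
from typing import Callable, Dict, List, Sequence

END = "<eos>"  # hypothetical end-of-candidate marker (assumption)


def build_trie(candidates: Sequence[Sequence[str]]) -> dict:
    """Build a prefix trie over tokenized candidate names."""
    root: dict = {}
    for tokens in candidates:
        node = root
        for tok in list(tokens) + [END]:
            node = node.setdefault(tok, {})
    return root


def constrained_greedy_decode(
    score_fn: Callable[[List[str]], Dict[str, float]],
    candidates: Sequence[Sequence[str]],
) -> List[str]:
    """Greedy decoding that can only follow paths in the candidate trie."""
    if not candidates:
        raise ValueError("at least one candidate is required")
    node = build_trie(candidates)
    output: List[str] = []
    while True:
        allowed = list(node.keys())   # tokens that keep the output valid
        scores = score_fn(output)     # stand-in for model next-token scores
        # Tokens missing from the score table are treated as -inf.
        best = max(allowed, key=lambda t: scores.get(t, float("-inf")))
        if best == END:
            return output
        output.append(best)
        node = node[best]


def toy_scores(prefix: List[str]) -> Dict[str, float]:
    # Toy scores (assumption): "London" scores highest but is not a candidate.
    return {"London": 5.0, "Paris": 1.0, "(city)": 0.9, "Hilton": 0.2, END: 0.1}


candidates = [["Paris", "(city)"], ["Paris", "Hilton"]]
result = constrained_greedy_decode(toy_scores, candidates)
print(result)  # ['Paris', '(city)']
```

Because every step chooses only among the children of the current trie node, the decoded sequence is always one of the supplied candidates, even when an out-of-set token such as "London" has the highest raw score.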