Agnus LLM: Robust and Flexible Entity Disambiguation with decoder-only Language Models

Kristian Noullet, Ayoub Ourgani, Niklas Thomas Lakner, Lukas Kinder, Tobias Käfer


Abstract
Entity disambiguation (ED) links ambiguous mentions in text to entries in a knowledge base and is a core task in entity linking systems. While pretrained decoder-only language models (DLMs) offer strong generalization capabilities, their effective use in ED has been limited by sensitivity to candidate order, susceptibility to hallucinated outputs, and potential dataset leakage. We introduce Agnus, a zero-shot ED framework that addresses these challenges through three core innovations: (1) order-invariant candidate encoding via shared positional embeddings and modified autoregressive attention masking, which eliminates bias from input ordering; (2) constrained decoding that restricts outputs to valid candidates, effectively preventing hallucinations; and (3) a synthetic dataset creation approach that serves as a diagnostic tool for detecting and mitigating data contamination. Agnus eliminates up to 15.2% of F1 variability caused by candidate permutations, delivering consistent and order-robust predictions previously unattainable with autoregressive architectures. In our experiments, Agnus achieves state-of-the-art performance on four standard ED benchmarks, surpassing prior zero-shot approaches by an average of 3.7% using small language models. We release code, data including candidate sets, and a synthetic benchmark to support reproducibility and controlled evaluation.
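
The first two ideas in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it is one plausible reading of "shared positional embeddings", "modified autoregressive attention masking", and "constrained decoding", and the paper's actual construction may differ. The function names `build_layout` and `constrained_pick`, the prefix/candidates/suffix prompt layout, and the per-candidate score dictionary are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

PREFIX, SUFFIX = -1, -2  # hypothetical segment labels for non-candidate tokens


def build_layout(
    prefix_len: int, cand_lens: List[int], suffix_len: int
) -> Tuple[List[int], List[List[bool]]]:
    """Return (position_ids, mask); mask[i][j] is True if token i may attend to token j.

    Assumed layout: [prefix tokens][candidate 0]...[candidate K-1][suffix tokens].
    """
    position_ids = list(range(prefix_len))
    segments = [PREFIX] * prefix_len
    for k, n in enumerate(cand_lens):
        # Every candidate restarts at the same offset, so no candidate
        # is "earlier" than another as far as positions are concerned.
        position_ids += [prefix_len + t for t in range(n)]
        segments += [k] * n
    # The suffix continues after the longest candidate.
    suffix_start = prefix_len + max(cand_lens, default=0)
    position_ids += [suffix_start + t for t in range(suffix_len)]
    segments += [SUFFIX] * suffix_len

    n_tokens = len(segments)
    mask = [[False] * n_tokens for _ in range(n_tokens)]
    for i in range(n_tokens):
        for j in range(i + 1):  # causal: only look at j <= i
            # Candidates see the prefix and themselves, never each other;
            # suffix tokens see everything before them.
            mask[i][j] = (
                segments[j] == PREFIX
                or segments[i] == SUFFIX
                or segments[i] == segments[j]
            )
    return position_ids, mask


def constrained_pick(scores: Dict[str, float], candidates: List[str]) -> str:
    """Return the best-scoring label among the valid candidates only."""
    if not candidates:
        raise ValueError("candidate list must not be empty")
    # Labels outside `candidates` are never considered, whatever their score.
    return max(candidates, key=lambda c: scores.get(c, float("-inf")))


position_ids, mask = build_layout(prefix_len=2, cand_lens=[2, 3], suffix_len=1)
# position_ids == [0, 1, 2, 3, 2, 3, 4, 5]
```

Under these assumptions, reordering the candidates changes neither the position a candidate token receives nor the set of tokens it can attend to, which is the intuition behind order invariance. Restricting the final choice to the candidate list means a label outside that list can never be returned.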
Anthology ID:
2025.ijcnlp-long.174
Volume:
Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics
Month:
December
Year:
2025
Address:
Mumbai, India
Editors:
Kentaro Inui, Sakriani Sakti, Haofen Wang, Derek F. Wong, Pushpak Bhattacharyya, Biplab Banerjee, Asif Ekbal, Tanmoy Chakraborty, Dhirendra Pratap Singh
Venues:
IJCNLP | AACL
Publisher:
The Asian Federation of Natural Language Processing and The Association for Computational Linguistics
Pages:
3266–3284
URL:
https://preview.aclanthology.org/ingest-ijcnlp-aacl/2025.ijcnlp-long.174/
Cite (ACL):
Kristian Noullet, Ayoub Ourgani, Niklas Thomas Lakner, Lukas Kinder, and Tobias Käfer. 2025. Agnus LLM: Robust and Flexible Entity Disambiguation with decoder-only Language Models. In Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, pages 3266–3284, Mumbai, India. The Asian Federation of Natural Language Processing and The Association for Computational Linguistics.
Cite (Informal):
Agnus LLM: Robust and Flexible Entity Disambiguation with decoder-only Language Models (Noullet et al., IJCNLP-AACL 2025)
PDF:
https://preview.aclanthology.org/ingest-ijcnlp-aacl/2025.ijcnlp-long.174.pdf