Lukas Kinder
2025
Positional Overload: Positional Debiasing and Context Window Extension for Large Language Models using Set Encoding
Lukas Kinder | Lukas Edman | Alexander Fraser | Tobias Käfer
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Large Language Models (LLMs) typically track the order of tokens using positional encoding, which causes two problems: positional bias, where the model is influenced by the ordering of content within the prompt, and a fixed context window, as models struggle to generalize to positions beyond those encountered during training. To address these limitations, we developed a novel method called set encoding. This method allows multiple pieces of text to be encoded at the same positions, thereby eliminating positional bias entirely. Another promising use case for set encoding is to increase the size of the input an LLM can handle. Our experiments demonstrate that set encoding allows an LLM to solve tasks whose inputs contain far more tokens than it can handle without set encoding. To our knowledge, set encoding is the first technique to effectively extend an LLM’s context window without requiring any additional training.
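The abstract's core mechanism, encoding several text segments at the same positions so the model sees no order among them, can be illustrated concretely. Below is a minimal sketch, assuming a prompt structured as a shared prefix, a set of order-free segments, and a suffix; the function name and the exact mask/position layout are illustrative assumptions, not the authors' released implementation.

```python
import torch

def build_set_encoding_inputs(prefix_len, segment_lens, suffix_len):
    """Return position ids and a boolean attention mask in which all set
    segments share the same positions and cannot attend to each other."""
    total = prefix_len + sum(segment_lens) + suffix_len
    pos = torch.empty(total, dtype=torch.long)
    allowed = torch.zeros(total, total, dtype=torch.bool)

    # Prefix: ordinary causal attention, positions 0 .. prefix_len-1.
    pos[:prefix_len] = torch.arange(prefix_len)
    allowed[:prefix_len, :prefix_len] = torch.tril(
        torch.ones(prefix_len, prefix_len, dtype=torch.bool))

    # Set segments: each restarts at position prefix_len and attends only
    # to the prefix and causally to itself -- never to sibling segments,
    # so no segment "comes before" another.
    offset = prefix_len
    for seg_len in segment_lens:
        pos[offset:offset + seg_len] = prefix_len + torch.arange(seg_len)
        allowed[offset:offset + seg_len, :prefix_len] = True
        allowed[offset:offset + seg_len, offset:offset + seg_len] = torch.tril(
            torch.ones(seg_len, seg_len, dtype=torch.bool))
        offset += seg_len

    # Suffix: its positions continue after the longest segment (so none are
    # reused), and it attends to everything before it plus itself causally.
    start = prefix_len + max(segment_lens)
    pos[offset:] = start + torch.arange(suffix_len)
    allowed[offset:, :offset] = True
    allowed[offset:, offset:] = torch.tril(
        torch.ones(suffix_len, suffix_len, dtype=torch.bool))

    return pos, allowed

# Example: a 4-token prefix, three set segments, a 6-token suffix.
pos_ids, attn_mask = build_set_encoding_inputs(4, [3, 5, 2], 6)
```

Feeding these position ids and this mask to a decoder in place of the standard causal mask makes the set segments positionally identical and mutually invisible, which is what removes any notion of order among them.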
Agnus LLM: Robust and Flexible Entity Disambiguation with Decoder-Only Language Models
Kristian Noullet | Ayoub Ourgani | Niklas Thomas Lakner | Lukas Kinder | Tobias Käfer
Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics
Entity disambiguation (ED) links ambiguous mentions in text to entries in a knowledge base and is a core task in entity linking systems. While pretrained decoder-only language models (DLMs) offer strong generalization capabilities, their effective use in ED has been restricted due to sensitivity to candidate order, susceptibility to hallucinated outputs, and potential dataset leakage. We introduce Agnus, a zero-shot ED framework that addresses these challenges through three core innovations: (1) order-invariant candidate encoding via shared positional embeddings and modified autoregressive attention masking, which eliminates bias from input ordering; (2) constrained decoding that restricts outputs to valid candidates, effectively preventing hallucinations; and (3) a synthetic dataset creation approach that serves as a diagnostic tool for detecting and mitigating data contamination. Agnus eliminates up to 15.2% of F1 variability caused by candidate permutations, delivering consistent, order-robust predictions previously unattainable with autoregressive architectures. In our experiments, Agnus achieves state-of-the-art performance on four standard ED benchmarks, surpassing prior zero-shot approaches by an average of 3.7% using small language models. We release code, data including candidate sets, and a synthetic benchmark to support reproducibility and controlled evaluation.
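Innovation (2), constrained decoding over a fixed candidate set, is a standard technique that can be sketched independently of the paper's code. The sketch below assumes greedy decoding over pre-tokenized candidate entity names; step_logits_fn, a callable returning next-token logits for the tokens generated so far, is a hypothetical stand-in for a model forward pass and not part of the released Agnus code.

```python
import math
import torch

def constrained_decode(step_logits_fn, candidates, eos_id):
    """Greedy decoding in which every step is restricted to token ids that
    keep the output a prefix of at least one candidate token sequence."""
    generated = []
    while True:
        logits = step_logits_fn(generated)  # next-token logits, shape (vocab,)

        # Allowed next tokens: the continuation of every candidate that still
        # matches what we have generated; EOS once a candidate is complete.
        allowed = set()
        for cand in candidates:
            if cand[:len(generated)] == generated:
                if len(cand) > len(generated):
                    allowed.add(cand[len(generated)])
                else:
                    allowed.add(eos_id)

        # Mask everything else to -inf so it can never be selected.
        mask = torch.full_like(logits, -math.inf)
        mask[list(allowed)] = 0.0
        next_id = int(torch.argmax(logits + mask))

        if next_id == eos_id:
            return generated
        generated.append(next_id)
```

Because the logits are masked to continuations of still-matching candidates (or EOS once a full candidate has been produced), the output can never be a hallucinated, out-of-inventory entity.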