@inproceedings{abbasi-etal-2025-normxlogit,
    title = "{N}orm{XL}ogit: The Head-on-Top Never Lies",
    author = "Abbasi, Sina  and
      Modarres, Mohammad Reza  and
      Pilehvar, Mohammad Taher",
    editor = "Christodoulopoulos, Christos  and
      Chakraborty, Tanmoy  and
      Rose, Carolyn  and
      Peng, Violet",
    booktitle = "Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2025",
    address = "Suzhou, China",
    publisher = "Association for Computational Linguistics",
    url = "https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1769/",
    pages = "34914--34935",
    ISBN = "979-8-89176-332-6",
    abstract = "With new large language models (LLMs) emerging frequently, it is important to consider the potential value of model-agnostic approaches that can provide interpretability across a variety of architectures. While recent advances in LLM interpretability show promise, many rely on complex, model-specific methods with high computational costs. To address these limitations, we propose NormXLogit, a novel technique for assessing the significance of individual input tokens. This method operates based on the input and output representations associated with each token. First, we demonstrate that the norm of word embeddings can be utilized as a measure of token importance. Second, we reveal a significant relationship between a token{'}s importance and how predictive its representation is of the model{'}s final output. Extensive analyses indicate that our approach outperforms existing gradient-based methods in terms of faithfulness and offers competitive performance compared to leading architecture-specific techniques."
}