Tracing Logit Trajectories Across Layer Depth: Dataset-Level Explainability for Language Models

Jeesu Jung; Sangkeun Jung


Abstract

Sentence-level explanations can miss the bigger picture of how a black-box model behaves across data, which matters most for complex criteria like safety that cannot be defined by a single rule. We trace **Logit-Trajectory**, which tracks adjacent-layer logit updates as vectors and aggregates them into a reproducible dataset-level trajectory pattern, enabling depth-wise explainability through signals such as coherence and angular rotation. Across 6 languages and 5 NLP tasks, we show these trajectory summaries reveal consistent depth-wise patterns that divergence- and similarity-based baselines often wash out due to scalarization. As a case study where dataset-level intermediate decision structure matters, we evaluate safety classification, reporting both trajectory-level visual separability and classification performance.
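The abstract does not give formulas, so the following is only a minimal sketch of the general idea, under stated assumptions. It assumes per-layer logit vectors are already available for every example, however they are read out from intermediate layers. It treats "coherence" as the length of the mean unit update direction across the dataset, and "angular rotation" as the angle between consecutive dataset-mean updates. The function name `trajectory_summary` and both definitions are illustrative choices, not the authors' stated method.

```python
import math


def _sub(a, b):
    return [x - y for x, y in zip(a, b)]


def _norm(v):
    return math.sqrt(sum(x * x for x in v))


def _unit(v):
    n = _norm(v)
    return [x / n for x in v] if n > 0 else [0.0] * len(v)


def _mean(vectors):
    count = len(vectors)
    return [sum(col) / count for col in zip(*vectors)]


def trajectory_summary(logits):
    """Summarize adjacent-layer logit updates over a dataset.

    logits[i][l] is the logit vector of example i at layer l (hypothetical layout).
    Returns (mean_updates, coherence, rotation_degrees).
    """
    n_layers = len(logits[0])
    mean_updates, coherence = [], []
    for layer in range(n_layers - 1):
        # Update vector between adjacent layers, one per example.
        deltas = [_sub(ex[layer + 1], ex[layer]) for ex in logits]
        mean_updates.append(_mean(deltas))
        # Assumed coherence: norm of the mean unit direction, in [0, 1].
        coherence.append(_norm(_mean([_unit(d) for d in deltas])))

    rotation = []
    for prev, nxt in zip(mean_updates, mean_updates[1:]):
        # Assumed rotation: angle between consecutive dataset-mean updates.
        cos = sum(x * y for x, y in zip(_unit(prev), _unit(nxt)))
        cos = max(-1.0, min(1.0, cos))  # guard against rounding error
        rotation.append(math.degrees(math.acos(cos)))
    return mean_updates, coherence, rotation


# Toy data: two examples, three layers, two-dimensional logits.
toy_logits = [
    [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]],
    [[0.0, 0.0], [2.0, 0.0], [2.0, 2.0]],
]
toy_means, toy_coherence, toy_rotation = trajectory_summary(toy_logits)
```

Keeping the updates as vectors until the final step is the point of the sketch: a scalar distance between adjacent layers would report the same magnitude for two examples moving in opposite directions, whereas the averaged unit direction drops toward zero in that case.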

Anthology ID: 2026.acl-long.809
Volume: Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 17800–17823
URL: https://preview.aclanthology.org/ingest-acl/2026.acl-long.809/
Cite (ACL): Jeesu Jung and Sangkeun Jung. 2026. Tracing Logit Trajectories Across Layer Depth: Dataset-Level Explainability for Language Models. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 17800–17823, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): Tracing Logit Trajectories Across Layer Depth: Dataset-Level Explainability for Language Models (Jung & Jung, ACL 2026)
PDF: https://preview.aclanthology.org/ingest-acl/2026.acl-long.809.pdf
Checklist: 2026.acl-long.809.checklist.pdf
