Lexical Familiarity Predicts Processing Depth for Nonliteral Language in Large Language Models

Lang-Ching Yeh, Yu-Chieh Wang, Shu-Kai Hsieh


Abstract
This paper investigates how large language models internally process nonliteral language. Analyzing five categories spanning slang, metaphor, and idioms across all 48 layers of Gemma-3-12B-IT with Gemma Scope 2 sparse autoencoders, we find a lexical familiarity gradient: processing depth depends on available prior lexical knowledge, not figurative type. Idioms diverge at L1 as entrenched units; expressions built from familiar words (metaphors, semantic-shift and constructional slang) converge at L7–9; neologisms peak at L41, activating 3× more unique features. Paraphrase residual analysis confirms strong signals only at the gradient endpoints, yielding a three-tier hierarchy of entrenched retrieval, known-word reanalysis, and novel-word construction. Crucially, this peak-layer structure replicates in base models (Gemma-PT, Qwen-Base), demonstrating that the gradient is a robust property of pretrained representations rather than an alignment artifact. We additionally identify an activation density confound in SAE feature counts that produces spurious cross-condition convergence. Overall, processing depth is better predicted by lexical familiarity than by figurative type, with implications for robustness to non-standard language and for SAE-based interpretability.
Anthology ID:
2026.trustnlp-main.32
Volume:
Proceedings of the 6th Workshop on Trustworthy NLP (TrustNLP 2026)
Month:
July
Year:
2026
Address:
San Diego, California
Editors:
Kai-Wei Chang, Ninareh Mehrabi, Satyapriya Krishna, Anubrata Das, Jwala Dhamala, Yang Trista Cao, Tharindu Kumarage, Anil Ramakrishna, Christos Christodoulopoulos, Yixin Wan, Aram Galstyan, Anoop Kumar, Rahul Gupta
Venues:
TrustNLP | WS
Publisher:
Association for Computational Linguistics
Pages:
456–470
URL:
https://preview.aclanthology.org/ingest-acl-workshops/2026.trustnlp-main.32/
Cite (ACL):
Lang-Ching Yeh, Yu-Chieh Wang, and Shu-Kai Hsieh. 2026. Lexical Familiarity Predicts Processing Depth for Nonliteral Language in Large Language Models. In Proceedings of the 6th Workshop on Trustworthy NLP (TrustNLP 2026), pages 456–470, San Diego, California. Association for Computational Linguistics.
Cite (Informal):
Lexical Familiarity Predicts Processing Depth for Nonliteral Language in Large Language Models (Yeh et al., TrustNLP 2026)
PDF:
https://preview.aclanthology.org/ingest-acl-workshops/2026.trustnlp-main.32.pdf