Semantically Comprehensive Token Pruning in LVLMs via Maximizing Concept Coverage

Xueting Li; Qi Liu; Chenghao Xu; Xu Yang; Guangtao Lyu; Jiahua Li; Cheng Deng

Abstract

High-resolution visual tokens in Large Visual Language Models (LVLMs) are extremely redundant and impose a substantial computational burden. Existing visual token pruning methods typically rely on simple metrics derived from human experience, such as attention or similarity, to rank and select tokens within a highly entangled feature space. However, these metrics lack interpretability and often introduce human bias, so they fail to capture the genuine semantic significance of tokens, especially given the inherent semantic complexity and ambiguity of visual tokens. To mitigate this limitation, we propose Semantically Comprehensive Token Selection (SCTS), a concept-driven method for unbiased, interpretable visual token pruning. To unravel the model’s intrinsic semantic representation mechanism, we first introduce a Sparse Autoencoder that disentangles visual features into an interpretable space in which each dimension encodes a distinct semantic concept. We then formulate token pruning as a Maximum Concept Coverage problem: we quantify the Marginal Semantic Gain (MSG) of each token, that is, its contribution to concepts not yet covered, and iteratively select the token with the highest MSG. This concept-centric approach prioritizes tokens with unique semantic contributions, maintaining semantic comprehensiveness and robust performance even at high compression ratios. Extensive experiments across multiple LVLM architectures and benchmarks show that SCTS consistently outperforms state-of-the-art approaches, achieving a superior trade-off between computational efficiency and semantic completeness.
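
The selection step described in the abstract can be illustrated with a minimal sketch of greedy maximum coverage. This is not the authors' implementation: the function name `greedy_concept_coverage`, the `threshold` rule for deciding when a token "covers" a concept, the gain defined as a count of newly covered concepts, and the toy activation matrix are all assumptions made for illustration. The abstract does not specify how Marginal Semantic Gain is computed.

```python
from typing import List, Sequence


def greedy_concept_coverage(
    activations: Sequence[Sequence[float]],
    budget: int,
    threshold: float = 0.0,
) -> List[int]:
    """Pick up to `budget` token indices by greedy maximum coverage.

    activations[t][c] is a hypothetical sparse-autoencoder activation of
    concept c for token t. A token "covers" a concept when its activation
    exceeds `threshold` (an assumption; the abstract does not define this).
    """
    # Set of concepts each token activates.
    concept_sets = [
        {c for c, a in enumerate(row) if a > threshold} for row in activations
    ]
    covered: set = set()
    selected: List[int] = []
    remaining = set(range(len(concept_sets)))

    while remaining and len(selected) < budget:
        # Marginal gain here = number of not-yet-covered concepts a token adds.
        # Ties are broken toward the lowest token index.
        best = max(sorted(remaining), key=lambda t: len(concept_sets[t] - covered))
        if not concept_sets[best] - covered:
            break  # no remaining token adds a new concept
        selected.append(best)
        covered |= concept_sets[best]
        remaining.remove(best)
    return selected


# Toy example: 4 tokens, 5 concepts (made-up values).
toy = [
    [1.0, 0.8, 0.0, 0.0, 0.0],  # token 0 -> concepts {0, 1}
    [0.9, 0.7, 0.0, 0.0, 0.0],  # token 1 -> concepts {0, 1}, redundant with token 0
    [0.0, 0.0, 0.5, 0.6, 0.4],  # token 2 -> concepts {2, 3, 4}
    [0.0, 0.3, 0.2, 0.0, 0.0],  # token 3 -> concepts {1, 2}
]
kept = greedy_concept_coverage(toy, budget=2)
print(kept)  # [2, 0]
```

In this toy case token 2 is chosen first because it adds three uncovered concepts, then token 0 because it adds two; token 1 duplicates token 0 and is never selected. Stopping early once no token adds a new concept is a choice of this sketch, and a real pruner might instead fill the remaining budget by another criterion.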

Anthology ID: 2026.acl-long.1282
Volume: Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 27829–27846
URL: https://preview.aclanthology.org/ingest-acl/2026.acl-long.1282/
Cite (ACL): Xueting Li, Qi Liu, Chenghao Xu, Xu Yang, Guangtao Lyu, Jiahua Li, and Cheng Deng. 2026. Semantically Comprehensive Token Pruning in LVLMs via Maximizing Concept Coverage. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 27829–27846, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): Semantically Comprehensive Token Pruning in LVLMs via Maximizing Concept Coverage (Li et al., ACL 2026)
PDF: https://preview.aclanthology.org/ingest-acl/2026.acl-long.1282.pdf
Checklist: 2026.acl-long.1282.checklist.pdf