2026
Spectra: A Mechanistic Interpretability Library for Vision-Language Models
Clement Neo | Yongsen Zheng | Kwok-Yan Lam | Luke Ong
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)
Vision-Language Models (VLMs) have become increasingly important in AI applications, yet interpretability tools for these models lag behind those available for text-only language models. While libraries like TransformerLens have enabled significant progress in understanding language models, existing tools for VLMs are limited to basic activation probing and saving. We introduce Spectra, a library specifically designed for mechanistic interpretability of VLMs that provides unified abstractions for activation patching, attention pattern analysis, and meta-functions across diverse VLM architectures. Built on HuggingFace’s Transformers, our library handles architecture-specific complexities through per-checkpoint configurations while maintaining a simple, high-level interface. We demonstrate the library’s capabilities through interpretability experiments on a counting task, showing how researchers can easily run experiments that were previously cumbersome. The library currently supports Qwen2.5-VL, Qwen3-VL, LLaVA 1.5, and SmolVLM, with a design that facilitates extension to additional architectures. The library is available at github.com/clemneo/vlm-spectra.