@inproceedings{zhang-etal-2025-finite,
    title     = {Finite State Automata Inside {Transformers} with {Chain-of-Thought}: A Mechanistic Study on State Tracking},
    author    = {Zhang, Yifan and
                 Du, Wenyu and
                 Jin, Dongming and
                 Fu, Jie and
                 Jin, Zhi},
    editor    = {Che, Wanxiang and
                 Nabende, Joyce and
                 Shutova, Ekaterina and
                 Pilehvar, Mohammad Taher},
    booktitle = {Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)},
    month     = jul,
    year      = {2025},
    address   = {Vienna, Austria},
    publisher = {Association for Computational Linguistics},
    url       = {https://aclanthology.org/2025.acl-long.668/},
    pages     = {13603--13621},
    isbn      = {979-8-89176-251-0},
    abstract  = {Chain-of-thought (CoT) significantly enhances the performance of large language models (LLMs) across a wide range of tasks, and prior research shows that CoT can theoretically increase expressiveness. However, there is limited mechanistic understanding of the algorithms that Transformer+CoT can learn. Our key contributions are: (1) We evaluate the state tracking capabilities of Transformer+CoT and its variants, confirming the effectiveness of CoT. (2) Next, we identify the circuit (a subset of model components, responsible for tracking the world state), indicating that late-layer MLP neurons play a key role. We propose two metrics, compression and distinction, and show that the neuron sets for each state achieve nearly 100{\%} accuracy, providing evidence of an implicit finite state automaton (FSA) embedded within the model. (3) Additionally, we explore three challenging settings: skipping intermediate steps, introducing data noises, and testing length generalization. Our results demonstrate that Transformer+CoT learns robust algorithms (FSAs), highlighting its resilience in challenging scenarios. Our code is available at https://github.com/IvanChangPKU/FSA.},
}
@comment{
Markdown (Informal)
[Finite State Automata Inside Transformers with Chain-of-Thought: A Mechanistic Study on State Tracking](https://aclanthology.org/2025.acl-long.668/) (Zhang et al., ACL 2025)
ACL
}