Chengjie Sun
Other people with similar names: Cheng-Jie Sun (孙承杰)
2026
Time-for-Accuracy: Formalizing Chain-of-Thought as an Expansion of Logical Depth
Yue Wang | Zhi Zhang | Wang Xi | Chengjie Sun | Lili Shan | Bingquan Liu
Findings of the Association for Computational Linguistics: ACL 2026
Chain-of-thought (CoT) often improves multi-step reasoning, but it remains unclear what kind of additional sequential computation longer traces actually enable. We connect CoT to Bennett’s logical depth, separating an answer’s description length from the sequential effort required to derive it, and view a CoT budget of T steps as a qualitative cap on realizable sequential computation. To operationalize realized depth beyond raw length, we introduce Effective Logical Depth (ELD), a deletion-based measure of step necessity under a specified inference interface. Across depth-controlled prefix-sum tasks and GSM8K rationale perturbations, we observe two consistent signatures of a Time-for-Accuracy tradeoff: (i) plateau-to-transition accuracy curves as the budget increases from being below to matching the task’s required depth, and (ii) sparse, position-dependent deletion sensitivity concentrated in early steps for deeper instances. On GSM8K, an Extract interface, where the model reads off the answer from the remaining rationale, remains near-perfect even after prefix deletions, whereas a Repair interface, where the model must re-solve from truncated rationale context, degrades markedly. Moreover, Socratic human rationales are consistently more robust than Main rationales under Repair. These results suggest that longer CoT helps primarily when it enables additional effective sequential computation, and that deletion-based diagnostics can distinguish computational steps from redundant ones.
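As an illustration of what a deletion-based measure of step necessity could look like, the following is a minimal sketch under stated assumptions, not the paper's definition of ELD. The names `deletion_necessity` and `parity_interface` are hypothetical. The toy interface (the parity of the sum of the kept steps) stands in for a model that produces an answer from a possibly truncated rationale. A step counts as necessary if deleting it alone changes the interface's answer.

```python
from typing import Callable, List, Sequence


def deletion_necessity(
    steps: Sequence[int],
    interface: Callable[[Sequence[int]], int],
) -> List[bool]:
    """For each step, report whether deleting it alone changes the answer.

    Hypothetical helper: `interface` maps the kept steps to an answer and
    stands in for whatever inference procedure reads the rationale.
    """
    reference = interface(steps)  # answer with the full rationale
    flags = []
    for i in range(len(steps)):
        kept = list(steps[:i]) + list(steps[i + 1:])  # drop step i only
        flags.append(interface(kept) != reference)
    return flags


def parity_interface(steps: Sequence[int]) -> int:
    """Toy interface (an assumption): the answer is the parity of the sum."""
    return sum(steps) % 2


# Toy rationale: each entry is the increment contributed by one step.
steps = [1, 0, 1, 0, 0, 1]
flags = deletion_necessity(steps, parity_interface)

# Toy stand-in for an effective depth: the number of necessary steps.
effective_depth = sum(flags)
print(flags, effective_depth)
```

In this toy setting, steps that add 0 are redundant under the parity interface, so the count of necessary steps (3) is smaller than the raw rationale length (6). Swapping in a different interface function changes which steps register as necessary, which is the sense in which such a measure depends on the specified inference interface.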