Xinglang Zhang

2026

Logical Phase Transitions: Understanding Collapse in LLM Logical Reasoning
Xinglang Zhang | Yunyao Zhang | ZeLiang Chen | Junqing Yu | Wei Yang | Zikai Song
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Symbolic logical reasoning is a critical yet underexplored capability of large language models (LLMs), providing reliable and verifiable decision-making in high-stakes domains such as mathematical reasoning and legal judgment. In this study, we present a systematic analysis of logical reasoning under controlled increases in logical complexity, and reveal a previously unrecognized phenomenon, which we term **Logical Phase Transitions**: rather than degrading smoothly, logical reasoning performance remains stable within a regime but collapses abruptly beyond a critical logical depth, mirroring physical phase transitions such as water freezing beyond a critical temperature threshold. Building on this insight, we propose **Neuro-Symbolic Curriculum Tuning**, a principled framework that adaptively aligns natural language with logical symbols to establish a shared representation, and reshapes training dynamics around phase-transition boundaries to progressively strengthen reasoning at increasing logical depths. Experiments on five benchmarks show that our approach effectively mitigates logical reasoning collapse at high complexity, yielding average accuracy gains of +1.26 under naive prompting and +3.95 under chain-of-thought (CoT) prompting, while improving generalization to unseen logical compositions.


Logical reasoning is a fundamental capability of large language models (LLMs). However, existing studies largely overlook the interplay between logical complexity and semantic complexity, which limits model robustness under abstract propositions, ambiguous contexts, and conflicting stances, all of which are central to human reasoning. We propose **LogicAgent**, a semiotic-square–guided framework that jointly addresses these two axes of difficulty. The semiotic square provides a principled structure for multi-perspective semantic analysis, and LogicAgent integrates automated deduction with reflective verification to manage logical complexity across deeper reasoning chains. To evaluate reasoning under coupled semantic and logical complexity, we introduce **RepublicQA**, a benchmark that contains abstract propositions with systematically constructed contrary and contradictory forms, providing a semantically rich setting for assessing logical reasoning in LLMs. Experiments show that LogicAgent achieves state-of-the-art performance on RepublicQA with a 6.25% average gain, and generalizes well to four mainstream logical reasoning benchmarks with an additional 7.05% improvement, highlighting the effectiveness of our semiotic-grounded multi-perspective reasoning in boosting LLMs’ logical performance.

Co-authors

ZeLiang Chen 1

Wenbing Li 1

Junxi Sheng 1

Venues

ACL 2
