Do Large Language Models excel in Complex Logical Reasoning with Formal Language?

Jin Jiang; Jianing Wang; Yuchen Yan; Yang Liu (刘扬); Jianhua Zhu; Mengdi Zhang; Liangcai Gao

Abstract

Large Language Models (LLMs) have been shown to achieve breakthrough performance on complex logical reasoning tasks. Nevertheless, most existing research focuses on employing formal language to guide LLMs toward reliable reasoning paths, while systematic evaluations of these capabilities remain limited. In this paper, we conduct a comprehensive evaluation of LLMs across various logical reasoning problems using formal languages. Along three dimensions, i.e., spectrum of LLMs, taxonomy of tasks, and format of trajectories, our key findings are: 1) Thinking models significantly outperform Instruct models, especially when formal language is employed; 2) All LLMs exhibit limitations in inductive reasoning capability, irrespective of whether they use a formal language; 3) Data in PoT format achieves the best generalization performance across other languages. We also curate formal-language-related training data to further enhance small language models, and the experimental results indicate that a simple rejected fine-tuning method can better enable LLMs to generalize across formal languages and achieve the best overall performance.

Anthology ID: 2025.emnlp-main.855
Volume: Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month: November
Year: 2025
Address: Suzhou, China
Editors: Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 16889–16914
URL: https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.855/
Cite (ACL): Jin Jiang, Jianing Wang, Yuchen Yan, Yang Liu, Jianhua Zhu, Mengdi Zhang, and Liangcai Gao. 2025. Do Large Language Models excel in Complex Logical Reasoning with Formal Language? In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 16889–16914, Suzhou, China. Association for Computational Linguistics.
Cite (Informal): Do Large Language Models excel in Complex Logical Reasoning with Formal Language? (Jiang et al., EMNLP 2025)
PDF: https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.855.pdf
Checklist: 2025.emnlp-main.855.checklist.pdf
