Logical forms complement probability in understanding language model (and human) performance

Yixuan Wang, Freda Shi


Abstract
With growing interest in using large language models (LLMs) for planning in natural language, understanding their behaviors has become an important research question. This work presents a systematic investigation of LLMs’ ability to perform logical reasoning in natural language. We introduce a controlled dataset of hypothetical and disjunctive syllogisms in propositional and modal logic and use it as a testbed for understanding LLM performance. Our results yield novel insights into predicting LLM behaviors: beyond the probability of the input, logical forms should be considered an important factor. We further show similarities and discrepancies between the logical reasoning performance of humans and LLMs by collecting and comparing behavioral data from both.
Anthology ID: 2025.acl-long.824
Volume: Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 16862–16877
URL: https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.824/
Cite (ACL): Yixuan Wang and Freda Shi. 2025. Logical forms complement probability in understanding language model (and human) performance. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 16862–16877, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): Logical forms complement probability in understanding language model (and human) performance (Wang & Shi, ACL 2025)
PDF: https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.824.pdf