Assessing the Belief Consistency of Large Language Models on the Logical Conversation Process
Tomoki Tsujimura, Matīss Rikters, Masaki Asada, Shusaku Egami, Tatsuya Ishigaki, Ken Yano, Hiroya Takamura
Abstract
To reliably interpret the evolving context of an LLM as a reasoning trace, the underlying belief of the LLM needs to transition consistently with the progression of the context. We focus on evaluating whether the beliefs held by a model remain consistent before and after the extension of the context. Previous research on consistency evaluation typically uses datasets with ground-truth answers, which is problematic because task-solving ability acts as a confounding factor, obscuring the direct evaluation of consistency. Furthermore, evaluating cases where inconsistency stems from multiple errors poses difficulties. We propose a new evaluation method to assess the consistency of LLMs in a multiple-choice question answering format, designed so that any option chosen is correct, allowing for the evaluation of the proposed belief consistency. It also supports isolation of errors such as reasoning failures and biases. We reveal that the belief consistency does not improve solely with model size scaling, whereas continual pre-training on code and mathematics text improves it. Furthermore, models trained on code and mathematics text show a seemingly contradictory result of increased logical failures, indicating that belief consistency and superficial consistency are not necessarily directly linked.
- Anthology ID:
- 2026.acl-long.1860
- Volume:
- Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
- Month:
- July
- Year:
- 2026
- Address:
- San Diego, California, United States
- Editors:
- Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
- Venue:
- ACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 40032–40055
- URL:
- https://preview.aclanthology.org/ingest-acl/2026.acl-long.1860/
- Cite (ACL):
- Tomoki Tsujimura, Matīss Rikters, Masaki Asada, Shusaku Egami, Tatsuya Ishigaki, Ken Yano, and Hiroya Takamura. 2026. Assessing the Belief Consistency of Large Language Models on the Logical Conversation Process. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 40032–40055, San Diego, California, United States. Association for Computational Linguistics.
- Cite (Informal):
- Assessing the Belief Consistency of Large Language Models on the Logical Conversation Process (Tsujimura et al., ACL 2026)
- PDF:
- https://preview.aclanthology.org/ingest-acl/2026.acl-long.1860.pdf