On the Consistency of Commonsense in Large Language Models
Guozheng Li, Peng Wang, Wenjun Ke, Zijie Xu, Jiajun Liu, Ziyu Shang
Abstract
Commonsense, humans’ implicit understanding of everyday situations, is crucial for large language models (LLMs). Existing commonsense evaluations for LLMs primarily focus on downstream knowledge tasks, failing to probe whether LLMs truly understand and utilize knowledge or merely memorize it. They also rely heavily on human annotation and lack automated large-scale data generation. To address this, we propose to automatically construct a large benchmark named CoCo (Consistency of Commonsense) comprising 39K samples derived from commonsense knowledge graphs (CSKGs), paired with symbolic questions and ground-truth answers, which systematically assesses LLMs’ knowledge memorization, comprehension, and application and examines the consistency between these tasks. To enhance our evaluation, we also propose novel metrics and prompting strategies. Experimental results on multiple LLMs reveal that CoCo presents significant challenges, and our detailed analysis provides deeper insights into the strengths and limitations of LLMs’ commonsense abilities.

- Anthology ID: 2025.findings-acl.834
- Volume: Findings of the Association for Computational Linguistics: ACL 2025
- Month: July
- Year: 2025
- Address: Vienna, Austria
- Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
- Venues: Findings | WS
- Publisher: Association for Computational Linguistics
- Pages: 16205–16225
- URL: https://preview.aclanthology.org/ingestion-acl-25/2025.findings-acl.834/
- Cite (ACL): Guozheng Li, Peng Wang, Wenjun Ke, Zijie Xu, Jiajun Liu, and Ziyu Shang. 2025. On the Consistency of Commonsense in Large Language Models. In Findings of the Association for Computational Linguistics: ACL 2025, pages 16205–16225, Vienna, Austria. Association for Computational Linguistics.
- Cite (Informal): On the Consistency of Commonsense in Large Language Models (Li et al., Findings 2025)
- PDF: https://preview.aclanthology.org/ingestion-acl-25/2025.findings-acl.834.pdf