Beyond Surface Simplicity: Revealing Hidden Reasoning Attributes for Precise Commonsense Diagnosis

Huijun Lian, Zekai Sun, Keqi Chen, Yingming Gao, Ya Li


Abstract
Commonsense question answering (QA) are widely used to evaluate the commonsense abilities of large language models. However, answering commonsense questions correctly requires not only knowledge but also reasoning—even for seemingly simple questions. We demonstrate that such hidden reasoning attributes in commonsense questions can lead evaluation accuracy differences of up to 24.8% across different difficulty levels in the same benchmark. Current benchmarks overlook these hidden reasoning attributes, making it difficult to assess a model’s specific levels of commonsense knowledge and reasoning ability. To address this issue, we introduce ReComSBench, a novel framework that reveals hidden reasoning attributes behind commonsense questions by leveraging the knowledge generated during the reasoning process. Additionally, ReComSBench proposes three new metrics for decoupled evaluation: Knowledge Balanced Accuracy, Marginal Sampling Gain, and Knowledge Coverage Ratio. Experiments show that ReComSBench provides insights into model performance that traditional benchmarks cannot offer. The difficulty stratification based on revealed hidden reasoning attributes performs as effectively as the model-probability-based approach but is more generalizable and better suited for improving a model’s commonsense reasoning abilities. By uncovering and analyzing the hidden reasoning attributes in commonsense data, ReComSBench offers a new approach to enhancing existing commonsense benchmarks.
Anthology ID:
2025.acl-long.627
Volume:
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue:
ACL
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
12820–12835
Language:
URL:
https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.627/
DOI:
Bibkey:
Cite (ACL):
Huijun Lian, Zekai Sun, Keqi Chen, Yingming Gao, and Ya Li. 2025. Beyond Surface Simplicity: Revealing Hidden Reasoning Attributes for Precise Commonsense Diagnosis. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 12820–12835, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
Beyond Surface Simplicity: Revealing Hidden Reasoning Attributes for Precise Commonsense Diagnosis (Lian et al., ACL 2025)
Copy Citation:
PDF:
https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.627.pdf