Qiner Lyu
2026
AwarenessBench: Assessing Cognitive Capabilities of Language Models
Xiaojian Li | Rongwu Xu | Tianyun Zhang | Yue Wang | Shuo Chen | Qiner Lyu | Briana Zhang | Peiran Yang | Kyle Xue Chen | Haoyuan Shi | Yu Wang | Wei Xu
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
As language models (LMs) exhibit increasingly consciousness-like behaviors, evaluating their cognitive abilities becomes essential. We introduce AwarenessBench, the first comprehensive benchmark for assessing the cognitive abilities of LMs across four dimensions: metacognition, self-awareness, social awareness, and situational awareness. It covers 15 cognitive functions and 14,381 samples. Evaluating 18 state-of-the-art LMs, we find that all consistently surpass random baselines, with more advanced models performing better. We further compare LMs with human performance across three demographic groups: the best-performing model surpasses human averages overall, but most models still fall markedly short in metacognition and self-awareness. Finally, we show that awareness is a distinct capability: progress in language modeling or reasoning does not necessarily translate into improved cognition.