CogBench: Benchmarking Cognitive Alignment of Large Language Models in Educational Question Answering

Tong Lu; Zhichun Wang (王志春); Yuanhao Sun; Yaoyu Zhou; Mingrui Li (李明锐); Yiming Guan; Zhiyong Bai

Abstract

Large language models (LLMs) possess strong capabilities in language understanding and generation, as well as remarkable problem-solving abilities. In the educational domain, a representative application is to employ LLMs as learning assistants that answer students’ questions and support their learning processes. In such scenarios, it is crucial for the model to perceive a student’s cognitive level and provide explanations that are appropriate to that level. However, whether LLMs can effectively accomplish this task has not yet been thoroughly investigated. To address this gap, we introduce CogBench, an evaluation benchmark designed to assess the cognitive alignment capabilities of LLMs in educational QA. CogBench comprises 2.1K mathematics questions, each associated with multiple valid solutions that rely on knowledge and reasoning at different cognitive levels. Building on this structure, we formulate three cognition-aware evaluation tasks and propose three complementary metrics to quantify cognitive alignment from multiple perspectives. Extensive experiments on 11 representative LLMs reveal that, while models can often produce correct answers, they still struggle to consistently generate explanations that are aligned with the intended cognitive level. These results highlight substantial room for improvement and establish CogBench as a diagnostic benchmark for advancing cognitively aligned educational AI systems.

Anthology ID: 2026.findings-acl.1068
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 21242–21256
URL: https://preview.aclanthology.org/ingest-acl/2026.findings-acl.1068/
Cite (ACL): Tong Lu, Zhichun Wang, Yuanhao Sun, Yaoyu Zhou, Mingrui Li, Yiming Guan, and Zhiyong Bai. 2026. CogBench: Benchmarking Cognitive Alignment of Large Language Models in Educational Question Answering. In Findings of the Association for Computational Linguistics: ACL 2026, pages 21242–21256, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): CogBench: Benchmarking Cognitive Alignment of Large Language Models in Educational Question Answering (Lu et al., Findings 2026)
PDF: https://preview.aclanthology.org/ingest-acl/2026.findings-acl.1068.pdf
Checklist: 2026.findings-acl.1068.checklist.pdf