Automated Knowledge Component Generation and Interpretable Knowledge Tracing in Coding Problems

Zhangqi Duan; Nigel Fernandez; Arun Balajiee Lekshmi Narayanan; Mohammad Hassany; Rafaella Sampaio de Alencar; Peter Brusilovsky; Bita Akram; Andrew Lan

Abstract

Knowledge components (KCs) are key to assessing student knowledge levels on fine-grained skills and driving personalization and feedback. However, crafting KCs and tagging them for problems, traditionally performed by human domain experts, is highly labor-intensive. Prior work has studied automated KC generation only for multiple-choice questions, not open-ended ones. We bridge this gap and present an automated, large language model (LLM)-based pipeline for KC generation and tagging for open-ended programming problems. We also develop an LLM-based knowledge tracing (KT) framework to leverage these LLM-generated KCs. We conduct extensive quantitative and qualitative evaluations on two real-world student code submission datasets. Results show that our KT method outperforms existing ones and that LLM-generated KCs outperform human-written KCs on future student response prediction. We also investigate how these KCs enable us to analyze student learning curves, and we conduct a human evaluation with course instructors to further verify the quality of KC-problem tagging.

Anthology ID: 2026.findings-acl.1670
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 33405–33423
URL: https://preview.aclanthology.org/ingest-acl/2026.findings-acl.1670/
Cite (ACL): Zhangqi Duan, Nigel Fernandez, Arun Balajiee Lekshmi Narayanan, Mohammad Hassany, Rafaella Sampaio de Alencar, Peter Brusilovsky, Bita Akram, and Andrew Lan. 2026. Automated Knowledge Component Generation and Interpretable Knowledge Tracing in Coding Problems. In Findings of the Association for Computational Linguistics: ACL 2026, pages 33405–33423, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): Automated Knowledge Component Generation and Interpretable Knowledge Tracing in Coding Problems (Duan et al., Findings 2026)
PDF: https://preview.aclanthology.org/ingest-acl/2026.findings-acl.1670.pdf
Checklist: 2026.findings-acl.1670.checklist.pdf
