From Questions to Assessment Tuples: A Multi-Agent Framework with Bloom-Specialized Agents and Automated Verification

Gee-Lyle Wong, Runcong Zhao, Yulan He, Jiazheng Li


Abstract
Automatic question generation with large language models has advanced rapidly, yet producing assessment-ready items, complete with mark schemes and expected answers, remains challenging, especially when generation must reliably target higher-order cognitive levels in Bloom’s Taxonomy. We propose a multi-agent, multi-stage framework that generates structured assessment tuples for both short-answer questions (SAQs) and scenario-based questions (SBQs), combining Bloom-specialized generation agents with staged decomposition and automated verification. We further introduce a rubric-guided LLM-as-a-judge evaluation framework with Bloom-specific alignment metrics. Experiments on university-level AI course material across five generation pipelines show that prompt-level Bloom conditioning alone is insufficient to reliably achieve cognitive control. In contrast, our structured approach yields consistent and notable improvements over baseline pipelines in alignment, mark scheme quality, and output yield, particularly for higher-order Bloom levels.
Anthology ID:
2026.bea-1.22
Volume:
Proceedings of the 21st Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2026)
Month:
July
Year:
2026
Address:
San Diego, California, USA
Editors:
Ekaterina Kochmar, Bashar Alhafni, Stefano Bannò, Marie Bexte, Jill Burstein, Andrea Horbach, Ronja Laarmann-Quante, Anais Tack, Victoria Yaneva, Zheng Yuan
Venues:
BEA | WS
Publisher:
Association for Computational Linguistics
Pages:
292–335
URL:
https://preview.aclanthology.org/ingest-acl-workshops/2026.bea-1.22/
Cite (ACL):
Gee-Lyle Wong, Runcong Zhao, Yulan He, and Jiazheng Li. 2026. From Questions to Assessment Tuples: A Multi-Agent Framework with Bloom-Specialized Agents and Automated Verification. In Proceedings of the 21st Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2026), pages 292–335, San Diego, California, USA. Association for Computational Linguistics.
Cite (Informal):
From Questions to Assessment Tuples: A Multi-Agent Framework with Bloom-Specialized Agents and Automated Verification (Wong et al., BEA 2026)
PDF:
https://preview.aclanthology.org/ingest-acl-workshops/2026.bea-1.22.pdf