Can Multimodal LLMs Generate Pedagogical Questions?
Thomas Gerald, Sahar Ghannay, Julie Lascar, Paul Lerner, Anne Vilnat
Abstract
Educational materials frequently combine text, diagrams, tables, and charts to convey complex concepts. Understanding such materials often requires reasoning across modalities rather than relying solely on textual descriptions. In educational contexts, the main challenge lies in assessing the relevance and quality of the questions themselves. This raises a key issue: what defines a good question in a specialized learning environment? By comparison, evaluating answers is a more conventional task, although it requires examining criteria consistent with the targeted educational level. To the best of our knowledge, the use of LLMs for assessing the pedagogical relevance of questions remains unexplored. This gap highlights the need to define pedagogical relevance more clearly and to investigate the consistency of LLM judgments, as well as their alignment with human evaluations. We introduce a new multimodal QA dataset in the education domain. To reduce the need for extensive human annotation, we leverage LLMs to help design questions on educational material, in combination with human annotation. Unlike most multimodal QA corpora, we focus on questions that a teacher could ask in class and that require drawing on different parts of the document to be answered. Results show that while LLM-as-a-judge is an efficient framework, many problems can arise, and aligning its predictions with human annotators is a difficult task for complex criteria.
- Anthology ID:
- 2026.lrec-main.429
- Volume:
- Proceedings of the Fifteenth Language Resources and Evaluation Conference
- Month:
- May
- Year:
- 2026
- Address:
- Palma de Mallorca, Spain
- Editors:
- Stelios Piperidis, Núria Bel, Henk van den Heuvel, Nancy Ide, Simon Krek, Antonio Toral
- Venue:
- LREC
- Publisher:
- ELRA Language Resource Association
- Pages:
- 5506–5515
- URL:
- https://preview.aclanthology.org/ingest-lrec/2026.lrec-main.429/
- Cite (ACL):
- Thomas Gerald, Sahar Ghannay, Julie Lascar, Paul Lerner, and Anne Vilnat. 2026. Can Multimodal LLMs Generate Pedagogical Questions? International Conference on Language Resources and Evaluation, main:5506–5515.
- Cite (Informal):
- Can Multimodal LLMs Generate Pedagogical Questions? (Gerald et al., LREC 2026)
- PDF:
- https://preview.aclanthology.org/ingest-lrec/2026.lrec-main.429.pdf