UniEDU: Toward Unified and Efficient Large Multimodal Models for Educational Tasks

Zhendong Chu, Jian Xie, Shen Wang, Zichao Wang, Qingsong Wen

Abstract

Education materials for K-12 students often consist of multiple modalities, such as text and images, posing challenges for models to fully understand nuanced information in these materials. In this paper, we propose a unified language and vision assistant UniEDU designed for various educational applications, including knowledge recommendation, knowledge tracing, time cost prediction, and user answer prediction, all within a single model. Unlike conventional task-specific models, UniEDU offers a unified solution that excels across multiple educational tasks while maintaining strong generalization capabilities. Its adaptability makes it well-suited for real-world deployment in diverse learning environments. Furthermore, UniEDU is optimized for industry-scale deployment by significantly reducing computational overhead—achieving approximately a 300% increase in efficiency—while maintaining competitive performance with minimal degradation compared to fully fine-tuned models. This work represents a significant step toward creating versatile AI systems tailored to the evolving demands of education.

Anthology ID: 2025.emnlp-industry.68
Volume: Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Industry Track
Month: November
Year: 2025
Address: Suzhou (China)
Editors: Saloni Potdar, Lina Rojas-Barahona, Sebastien Montella
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 1007–1016
URL: https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-industry.68/
Cite (ACL): Zhendong Chu, Jian Xie, Shen Wang, Zichao Wang, and Qingsong Wen. 2025. UniEDU: Toward Unified and Efficient Large Multimodal Models for Educational Tasks. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Industry Track, pages 1007–1016, Suzhou (China). Association for Computational Linguistics.
Cite (Informal): UniEDU: Toward Unified and Efficient Large Multimodal Models for Educational Tasks (Chu et al., EMNLP 2025)
PDF: https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-industry.68.pdf
