Multimodal Large Language Models for Human-AI Interaction: Foundations, Agents, and Inclusive Applications

Shafiq Joty, Enamul Hoque, Ahmed Masry, Spandana Gella, Samira Ebrahimi Kahou


Abstract
This tutorial presents the foundations, agentic capabilities, and inclusive applications of multimodal large language models, covering model architectures, multimodal alignment and reasoning, conversational GUI agents, accessibility, multilingual communication, and responsible deployment.
Anthology ID:
2026.eacl-tutorials.4
Volume:
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 6: Tutorials)
Month:
March
Year:
2026
Address:
St. Julian's, Malta
Editors:
Chenghua Lin, Aline Paes, Rodrigo Wilkens
Venue:
EACL
Publisher:
Association for Computational Linguistics
Pages:
9–11
URL:
https://preview.aclanthology.org/ingest-eacl/2026.eacl-tutorials.4/
Cite (ACL):
Shafiq Joty, Enamul Hoque, Ahmed Masry, Spandana Gella, and Samira Ebrahimi Kahou. 2026. Multimodal Large Language Models for Human-AI Interaction: Foundations, Agents, and Inclusive Applications. In Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 6: Tutorials), pages 9–11, St. Julian's, Malta. Association for Computational Linguistics.
Cite (Informal):
Multimodal Large Language Models for Human-AI Interaction: Foundations, Agents, and Inclusive Applications (Joty et al., EACL 2026)
PDF:
https://preview.aclanthology.org/ingest-eacl/2026.eacl-tutorials.4.pdf