Tutorial on Multimodal Machine Learning

Louis-Philippe Morency, Paul Pu Liang, Amir Zadeh


Abstract
Multimodal machine learning involves integrating and modeling information from multiple heterogeneous sources of data. It is a challenging yet crucial area with numerous real-world applications in multimedia, affective computing, robotics, finance, HCI, and healthcare. This tutorial builds on a new edition of a survey paper on multimodal ML, as well as on previously given tutorials and academic courses. It will describe an updated taxonomy of multimodal machine learning that synthesizes its core technical challenges and major directions for future research.
Anthology ID:
2022.naacl-tutorials.5
Volume:
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Tutorial Abstracts
Month:
July
Year:
2022
Address:
Seattle, United States
Editors:
Miguel Ballesteros, Yulia Tsvetkov, Cecilia O. Alm
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
33–38
URL:
https://aclanthology.org/2022.naacl-tutorials.5
DOI:
10.18653/v1/2022.naacl-tutorials.5
Cite (ACL):
Louis-Philippe Morency, Paul Pu Liang, and Amir Zadeh. 2022. Tutorial on Multimodal Machine Learning. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Tutorial Abstracts, pages 33–38, Seattle, United States. Association for Computational Linguistics.
Cite (Informal):
Tutorial on Multimodal Machine Learning (Morency et al., NAACL 2022)
PDF:
https://preview.aclanthology.org/naacl24-info/2022.naacl-tutorials.5.pdf
Data:
Visual Question Answering