Multimodality for NLP-Centered Applications: Resources, Advances and Frontiers

Muskan Garg; Seema Wazarkar; Muskaan Singh; Ondřej Bojar

Abstract

With the development of multimodal systems and natural language generation techniques, the resurgence of multimodal datasets has attracted significant research interest; such datasets aim to provide new information that enriches the representation of textual data. However, a comprehensive survey of this area is still lacking. To this end, we take the first step and present a thorough review of the field. This paper provides an overview of publicly available datasets with different modalities, organized by application. Furthermore, we discuss new frontiers and give our thoughts. We hope this survey can give the community quick access to, and a general picture of, the multimodal datasets for specific Natural Language Processing (NLP) applications, and that it motivates future research. In this context, we release the collection of all multimodal datasets, easily accessible here: https://github.com/drmuskangarg/Multimodal-datasets

Anthology ID: 2022.lrec-1.738
Volume: Proceedings of the Thirteenth Language Resources and Evaluation Conference
Month: June
Year: 2022
Address: Marseille, France
Editors: Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Jan Odijk, Stelios Piperidis
Venue: LREC
Publisher: European Language Resources Association
Pages: 6837–6847
URL: https://aclanthology.org/2022.lrec-1.738
Cite (ACL): Muskan Garg, Seema Wazarkar, Muskaan Singh, and Ondřej Bojar. 2022. Multimodality for NLP-Centered Applications: Resources, Advances and Frontiers. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 6837–6847, Marseille, France. European Language Resources Association.
Cite (Informal): Multimodality for NLP-Centered Applications: Resources, Advances and Frontiers (Garg et al., LREC 2022)
PDF: https://preview.aclanthology.org/emnlp22-frontmatter/2022.lrec-1.738.pdf
Code: drmuskangarg/multimodal-datasets
Data: CMU-MOSEI, GQA, IEMOCAP, IMDb Movie Reviews, MDID, MELD, MS COCO, MemexQA, R2VQ, Screen2Words, SumMe, TDIUC, TGIF-QA, TVQA, VATEX
