Abstract
With the development of multimodal systems and natural language generation techniques, the resurgence of multimodal datasets has attracted significant research interest. Such datasets aim to provide new information that enriches the representation of textual data. However, a comprehensive survey of this area is still lacking. To this end, we take the first step and present a thorough review of this research field. This paper provides an overview of publicly available datasets with different modalities, organized by application. Furthermore, we discuss new frontiers and give our thoughts. We hope this survey can give the community quick access to, and a general picture of, multimodal datasets for specific Natural Language Processing (NLP) applications, and that it motivates future research. In this context, we release a collection of all the multimodal datasets, easily accessible here: https://github.com/drmuskangarg/Multimodal-datasets
- Anthology ID: 2022.lrec-1.738
- Volume: Proceedings of the Thirteenth Language Resources and Evaluation Conference
- Month: June
- Year: 2022
- Address: Marseille, France
- Editors: Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Jan Odijk, Stelios Piperidis
- Venue: LREC
- Publisher: European Language Resources Association
- Pages: 6837–6847
- URL: https://aclanthology.org/2022.lrec-1.738
- Cite (ACL): Muskan Garg, Seema Wazarkar, Muskaan Singh, and Ondřej Bojar. 2022. Multimodality for NLP-Centered Applications: Resources, Advances and Frontiers. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 6837–6847, Marseille, France. European Language Resources Association.
- Cite (Informal): Multimodality for NLP-Centered Applications: Resources, Advances and Frontiers (Garg et al., LREC 2022)
- PDF: https://preview.aclanthology.org/emnlp22-frontmatter/2022.lrec-1.738.pdf
- Code: drmuskangarg/multimodal-datasets
- Data: CMU-MOSEI, GQA, IEMOCAP, IMDb Movie Reviews, MDID, MELD, MS COCO, MemexQA, R2VQ, Screen2Words, SumMe, TDIUC, TGIF-QA, TVQA, VATEX