% ACL Anthology entry 2021.maiworkshop-1.5 — Third Workshop on Multimodal
% Artificial Intelligence (NAACL 2021). Fixed: url previously pointed to an
% ephemeral preview mirror (preview.aclanthology.org/jlcl-multiple-ingestion/);
% now uses the canonical Anthology URL, matching the DOI.
@inproceedings{zeng-2021-multi,
    title = "Multi Task Learning based Framework for Multimodal Classification",
    author = "Zeng, Danting",
    editor = "Zadeh, Amir  and
      Morency, Louis-Philippe  and
      Liang, Paul Pu  and
      Ross, Candace  and
      Salakhutdinov, Ruslan  and
      Poria, Soujanya  and
      Cambria, Erik  and
      Shi, Kelly",
    booktitle = "Proceedings of the Third Workshop on Multimodal Artificial Intelligence",
    month = jun,
    year = "2021",
    address = "Mexico City, Mexico",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.maiworkshop-1.5/",
    doi = "10.18653/v1/2021.maiworkshop-1.5",
    pages = "30--35",
    abstract = "Large-scale multi-modal classification aim to distinguish between different multi-modal data, and it has drawn dramatically attentions since last decade. In this paper, we propose a multi-task learning-based framework for the multimodal classification task, which consists of two branches: multi-modal autoencoder branch and attention-based multi-modal modeling branch. Multi-modal autoencoder can receive multi-modal features and obtain the interactive information which called multi-modal encoder feature, and use this feature to reconstitute all the input data. Besides, multi-modal encoder feature can be used to enrich the raw dataset, and improve the performance of downstream tasks (such as classification task). As for attention-based multimodal modeling branch, we first employ attention mechanism to make the model focused on important features, then we use the multi-modal encoder feature to enrich the input information, achieve a better performance. We conduct extensive experiments on different dataset, the results demonstrate the effectiveness of proposed framework."
}
Markdown (Informal)
[Multi Task Learning based Framework for Multimodal Classification](https://aclanthology.org/2021.maiworkshop-1.5/) (Zeng, maiworkshop 2021)
ACL