Abstract
In this paper, we present a multimodal architecture to determine the emotion expressed in a meme. The architecture utilizes both the textual and the visual information present in a meme. To extract image features, we experimented with the pre-trained VGG-16 and Inception-V3 classifiers, and to extract text features, we used LSTM and BERT classifiers. For the LSTM classifier, we experimented with both FastText and GloVe embeddings. The best F1 scores our classifiers obtained in the official results are 0.3309, 0.4752, and 0.2897 for Tasks A, B, and C, respectively, in the Memotion Analysis task (Task 8) organized as part of the International Workshop on Semantic Evaluation 2020 (SemEval 2020). In our study, we found that combining the textual and visual information expressed in a meme improves classifier performance compared to standalone classifiers that use only the text or only the image.
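The abstract describes the fusion of image and text features only at a high level. As an illustrative sketch (not the authors' released code), the following Keras snippet combines a frozen, pre-trained VGG-16 image encoder with an LSTM text encoder via feature concatenation. All hyperparameters (vocabulary size, sequence length, layer widths, number of classes) are placeholder assumptions, and the embedding layer would in practice be initialized from GloVe or FastText vectors.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model
from tensorflow.keras.applications import VGG16

# Placeholder hyperparameters (assumptions, not the paper's settings)
VOCAB_SIZE = 20000
MAX_LEN = 50
EMBED_DIM = 300   # e.g., 300-d GloVe or FastText vectors
NUM_CLASSES = 3   # e.g., the sentiment classes of Task A

# Image branch: frozen VGG-16 used as a feature extractor
image_in = layers.Input(shape=(224, 224, 3))
vgg = VGG16(include_top=False, weights="imagenet", pooling="avg")
vgg.trainable = False
img_feat = vgg(image_in)              # (batch, 512)

# Text branch: embedding (GloVe/FastText-initialized in practice) + LSTM
text_in = layers.Input(shape=(MAX_LEN,))
emb = layers.Embedding(VOCAB_SIZE, EMBED_DIM)(text_in)
txt_feat = layers.LSTM(128)(emb)      # (batch, 128)

# Late fusion: concatenate the two modality features, then classify
fused = layers.Concatenate()([img_feat, txt_feat])
x = layers.Dense(256, activation="relu")(fused)
x = layers.Dropout(0.5)(x)
out = layers.Dense(NUM_CLASSES, activation="softmax")(x)

model = Model(inputs=[image_in, text_in], outputs=out)
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
```

The authors' exact fusion mechanism, the per-subtask classifier heads, and the Inception-V3 and BERT variants are detailed in the paper itself.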
- Anthology ID:
- 2020.semeval-1.112
- Volume:
- Proceedings of the Fourteenth Workshop on Semantic Evaluation
- Month:
- December
- Year:
- 2020
- Address:
- Barcelona (online)
- Venue:
- SemEval
- SIGs:
- SIGLEX | SIGSEM
- Publisher:
- International Committee for Computational Linguistics
- Pages:
- 885–890
- URL:
- https://aclanthology.org/2020.semeval-1.112
- DOI:
- 10.18653/v1/2020.semeval-1.112
- Cite (ACL):
- Arup Baruah, Kaushik Das, Ferdous Barbhuiya, and Kuntal Dey. 2020. IIITG-ADBU at SemEval-2020 Task 8: A Multimodal Approach to Detect Offensive, Sarcastic and Humorous Memes. In Proceedings of the Fourteenth Workshop on Semantic Evaluation, pages 885–890, Barcelona (online). International Committee for Computational Linguistics.
- Cite (Informal):
- IIITG-ADBU at SemEval-2020 Task 8: A Multimodal Approach to Detect Offensive, Sarcastic and Humorous Memes (Baruah et al., SemEval 2020)
- PDF:
- https://preview.aclanthology.org/ingestion-script-update/2020.semeval-1.112.pdf