MilaNLP at SemEval-2022 Task 5: Using Perceiver IO for Detecting Misogynous Memes with Text and Image Modalities

Giuseppe Attanasio; Debora Nozza; Federico Bianchi

doi:10.18653/v1/2022.semeval-1.90

Abstract

In this paper, we describe the system proposed by the MilaNLP team for the Multimedia Automatic Misogyny Identification (MAMI) challenge. We use Perceiver IO as a multimodal late fusion over unimodal streams to address both sub-tasks A and B. We build unimodal embeddings using Vision Transformer (image) and RoBERTa (text transcript). We enrich the input representation using face and demographic recognition, image captioning, and detection of adult content and web entities. To the best of our knowledge, this work is the first to use Perceiver IO combining text and image modalities. The proposed approach outperforms unimodal and multimodal baselines.
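The late-fusion idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the unimodal embeddings are already computed (random vectors stand in for the Vision Transformer and RoBERTa outputs), and it reduces Perceiver IO to a single unprojected cross-attention step from a small latent array to the concatenated inputs, followed by mean pooling and a linear classifier. All names, shapes, and weights (`cross_attend`, `late_fusion_score`, `dim`, the latent count) are hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cross_attend(latents, inputs):
    """Each latent vector attends over all input vectors.

    Simplification: single head, no learned query/key/value projections.
    """
    scale = math.sqrt(len(inputs[0]))
    fused = []
    for q in latents:
        weights = softmax([dot(q, k) / scale for k in inputs])
        fused.append(
            [sum(w * x[d] for w, x in zip(weights, inputs)) for d in range(len(q))]
        )
    return fused


def late_fusion_score(image_emb, text_emb, latents, clf_weights, clf_bias=0.0):
    """Fuse unimodal embeddings via latent cross-attention; return a probability."""
    inputs = image_emb + text_emb  # concatenate along the sequence axis
    fused = cross_attend(latents, inputs)
    pooled = [sum(col) / len(fused) for col in zip(*fused)]  # mean over latents
    logit = dot(pooled, clf_weights) + clf_bias
    return 1.0 / (1.0 + math.exp(-logit))


rng = random.Random(0)
dim = 8  # hypothetical shared embedding size
# Random stand-ins for precomputed unimodal features (assumed shapes).
image_emb = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(4)]
text_emb = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(6)]
latents = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(3)]
clf_weights = [rng.gauss(0, 1) for _ in range(dim)]

prob = late_fusion_score(image_emb, text_emb, latents, clf_weights)
print(f"score from untrained, random weights: {prob:.3f}")
```

Because the latent array has a fixed size, the cost of the fusion step grows linearly with the number of input vectors, so additional feature streams (for example caption or web-entity embeddings) could be appended to `inputs` without changing the rest of the sketch.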

Anthology ID: 2022.semeval-1.90
Volume: Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)
Month: July
Year: 2022
Address: Seattle, United States
Editors: Guy Emerson, Natalie Schluter, Gabriel Stanovsky, Ritesh Kumar, Alexis Palmer, Nathan Schneider, Siddharth Singh, Shyam Ratan
Venue: SemEval
SIG: SIGLEX
Publisher: Association for Computational Linguistics
Pages: 654–662
URL: https://preview.aclanthology.org/icon-24-ingestion/2022.semeval-1.90/
DOI: 10.18653/v1/2022.semeval-1.90
Cite (ACL): Giuseppe Attanasio, Debora Nozza, and Federico Bianchi. 2022. MilaNLP at SemEval-2022 Task 5: Using Perceiver IO for Detecting Misogynous Memes with Text and Image Modalities. In Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022), pages 654–662, Seattle, United States. Association for Computational Linguistics.
Cite (Informal): MilaNLP at SemEval-2022 Task 5: Using Perceiver IO for Detecting Misogynous Memes with Text and Image Modalities (Attanasio et al., SemEval 2022)
PDF: https://preview.aclanthology.org/icon-24-ingestion/2022.semeval-1.90.pdf
Video: https://preview.aclanthology.org/icon-24-ingestion/2022.semeval-1.90.mp4
Code: milanlproc/milanlp-at-mami
Data: FairFace, Hateful Memes
