Shahad Abir
2026
CYBERPUNK@DravidianLangTech 2026: Multimodal Political Meme Classification using CLIP and Logo Similarity
Proceedings of the Sixth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages
We present our system for the DravidianLangTech 2026 shared task on multi-level political meme classification in Tamil and Malayalam. The task has two hierarchical levels: (1) stance detection (Support vs. Troll) and (2) target identification (Person, Party, or Intersection). Our approach combines CLIP vision-language embeddings (ViT-L-14) with face detection features and political logo similarity scores, yielding a 773-dimensional feature representation. We train a separate LinearSVC classifier for each language and task level. Our system achieved Rank 1 in Malayalam with an average F1-score of 0.7930 and Rank 6 in Tamil with an average F1-score of 0.7666. Our code is available at https://github.com/A-k-a-sh/Shared-task-multimodal-political-meme.
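The feature construction described above can be illustrated with a minimal sketch. It is not taken from the released code. It assumes the CLIP ViT-L-14 embedding contributes 768 dimensions, which leaves 5 dimensions for the face and logo features. The split used here (1 face feature and 4 logo similarity scores) is a hypothetical choice, as are the names `build_features`, `face_features`, and `logo_refs`. Classifier training is omitted.

```python
import math
from typing import List, Sequence

CLIP_DIM = 768  # embedding width of CLIP ViT-L-14


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_features(
    clip_embedding: Sequence[float],
    face_features: Sequence[float],
    logo_refs: Sequence[Sequence[float]],
) -> List[float]:
    """Concatenate CLIP embedding, face features, and logo similarity scores.

    Assumption: each logo score is the cosine similarity between the meme's
    CLIP embedding and a reference embedding of one party logo.
    """
    if len(clip_embedding) != CLIP_DIM:
        raise ValueError(f"expected a {CLIP_DIM}-dimensional CLIP embedding")
    logo_sims = [cosine_similarity(clip_embedding, ref) for ref in logo_refs]
    return list(clip_embedding) + list(face_features) + logo_sims
```

With one face feature and four logo references, the resulting vector has 768 + 1 + 4 = 773 entries, matching the dimensionality stated in the abstract. Such vectors could then be passed to a linear classifier, with one model per language and task level.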