Supriya Chanda


2026

Recent advances in text-to-image generation have enabled automated visual storytelling, yet most existing datasets remain monolingual and culturally narrow. We introduce MUSIA, a Multilingual Story Illustration Corpus designed to advance research in cross-lingual and culturally grounded narrative illustration. MUSIA comprises bilingual (English-Hindi) story-image pairs drawn from open literary and folk sources, curated to reflect diverse cultural themes, artistic styles, and linguistic structures. Each story includes multiple illustrations aligned at the scene level, accompanied by quality-verified mappings for narrative-visual coherence. To establish a reproducible benchmark, we propose a two-stage baseline combining transformer-based semantic summarization with diffusion-based image generation, achieving strong performance in relevance, visual quality, and consistency. MUSIA represents the first step toward a scalable, culturally inclusive benchmark for multilingual visual storytelling, enabling fair and reproducible research across low-resource and underrepresented languages.

2020

This paper describes the IRlab@IIT-BHU system for OffensEval 2020. We use an SVM with TF-IDF features to identify and categorize hate speech and offensive language in social media for two languages. In subtask A, a linear SVM classifier detects abusive content in tweets, achieving macro F1 scores of 0.779 and 0.718 for Arabic and Greek, respectively.
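The macro F1 metric reported above is the unweighted mean of the per-class F1 scores, so a rare class counts as much as a frequent one. A minimal sketch of the computation follows; the function name `macro_f1` and the toy labels are assumptions made for illustration and are not taken from the paper or the shared-task data.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Return the unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        return 0.0

    # Count true positives, false positives, and false negatives per class.
    tp, fp, fn = Counter(), Counter(), Counter()
    for gold, guess in zip(y_true, y_pred):
        if gold == guess:
            tp[gold] += 1
        else:
            fp[guess] += 1
            fn[gold] += 1

    scores = []
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels, invented for illustration only.
gold = ["OFF", "NOT", "NOT", "OFF", "NOT", "NOT"]
pred = ["OFF", "NOT", "OFF", "NOT", "NOT", "NOT"]
print(round(macro_f1(gold, pred), 3))
```

On this toy input the minority class has an F1 of 0.5 and the majority class 0.75, so the macro average sits midway between the two rather than being dominated by the larger class.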
This paper reports our submission to Shared Task 2, Identification of Informative COVID-19 English Tweets, at W-NUT 2020. We tried several techniques and briefly describe the two models that showed promising results in tweet classification: DistilBERT and FastText. DistilBERT achieves an F1 score of 0.7508 on the test set, the best of our submissions.