Ming Shan Hee


2025

pdf bib
Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia
Samuel Cahyawijaya | Holy Lovenia | Joel Ruben Antony Moniz | Tack Hwa Wong | Mohammad Rifqi Farhansyah | Thant Thiri Maung | Frederikus Hudi | David Anugraha | Muhammad Ravi Shulthan Habibi | Muhammad Reza Qorib | Amit Agarwal | Joseph Marvin Imperial | Hitesh Laxmichand Patel | Vicky Feliren | Bahrul Ilmi Nasution | Manuel Antonio Rufino | Genta Indra Winata | Rian Adam Rajagede | Carlos Rafael Catalan | Mohamed Fazli Mohamed Imam | Priyaranjan Pattnayak | Salsabila Zahirah Pranida | Kevin Pratama | Yeshil Bangera | Adisai Na-Thalang | Patricia Nicole Monderin | Yueqi Song | Christian Simon | Lynnette Hui Xian Ng | Richardy Lobo Sapan | Taki Hasan Rafi | Bin Wang | Supryadi | Kanyakorn Veerakanjana | Piyalitt Ittichaiwong | Matthew Theodore Roque | Karissa Vincentio | Takdanai Kreangphet | Phakphum Artkaew | Kadek Hendrawan Palgunadi | Yanzhi Yu | Rochana Prih Hastuti | William Nixon | Mithil Bangera | Adrian Xuan Wei Lim | Aye Hninn Khine | Hanif Muhammad Zhafran | Teddy Ferdinan | Audra Aurora Izzani | Ayushman Singh | Evan Evan | Jauza Akbar Krito | Michael Anugraha | Fenal Ashokbhai Ilasariya | Haochen Li | John Amadeo Daniswara | Filbert Aurelian Tjiaranata | Eryawan Presma Yulianrifat | Can Udomcharoenchaikit | Fadil Risdian Ansori | Mahardika Krisna Ihsani | Giang Nguyen | Anab Maulana Barik | Dan John Velasco | Rifo Ahmad Genadi | Saptarshi Saha | Chengwei Wei | Isaiah Edri W. Flores | Kenneth Chen Ko Han | Anjela Gail D. Santos | Wan Shen Lim | Kaung Si Phyo | Tim Santos | Meisyarah Dwiastuti | Jiayun Luo | Jan Christian Blaise Cruz | Ming Shan Hee | Ikhlasul Akmal Hanif | M.Alif Al Hakim | Muhammad Rizky Sya’ban | Kun Kerdthaisong | Lester James Validad Miranda | Fajri Koto | Tirana Noor Fatyanosa | Alham Fikri Aji | Jostin Jerico Rosal | Jun Kevin | Robert Wijaya | Onno P. Kampman | Ruochen Zhang | Börje F. Karlsson | Peerat Limkonchotiwat
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Despite Southeast Asia’s (SEA) extraordinary linguistic and cultural diversity, the region remains significantly underrepresented in vision-language (VL) research, resulting in AI models that inadequately capture SEA cultural nuances. To fill this gap, we present SEA-VL, an open-source initiative dedicated to developing culturally relevant high-quality datasets for SEA languages. By involving contributors from SEA countries, SEA-VL ensures better cultural relevance and diversity, fostering greater inclusivity of underrepresented languages and cultural depictions in VL research. Our methodology employed three approaches: community-driven crowdsourcing with SEA contributors, automated image crawling, and synthetic image generation. We evaluated each method’s effectiveness in capturing cultural relevance. We found that image crawling achieves approximately ~85% cultural relevance while being more cost- and time-efficient than crowdsourcing, whereas synthetic image generation failed to accurately reflect SEA cultural nuances and contexts. Collectively, we gathered 1.28 million SEA culturally relevant images, more than 50 times larger than other existing datasets. This work bridges the representation gap in SEA, establishes a foundation for developing culturally aware AI systems for this region, and provides a replicable framework for addressing representation gaps in other underrepresented regions.

2024

pdf bib
Bridging Modalities: Enhancing Cross-Modality Hate Speech Detection with Few-Shot In-Context Learning
Ming Shan Hee | Aditi Kumaresan | Roy Ka-Wei Lee
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

The widespread presence of hate speech on the internet, including formats such as text-based tweets and multimodal memes, poses a significant challenge to digital platform safety. Recent research has developed detection models tailored to specific modalities; however, there is a notable gap in transferring detection capabilities across different formats. This study conducts extensive experiments using few-shot in-context learning with large language models to explore the transferability of hate speech detection between modalities. Our findings demonstrate that text-based hate speech examples can significantly enhance the classification accuracy of vision-language hate speech. Moreover, text-based demonstrations outperform vision-language demonstrations in few-shot learning settings. These results highlight the effectiveness of cross-modality knowledge transfer and offer valuable insights for improving hate speech detection systems.

pdf bib
Recent Advances in Online Hate Speech Moderation: Multimodality and the Role of Large Models
Ming Shan Hee | Shivam Sharma | Rui Cao | Palash Nandi | Preslav Nakov | Tanmoy Chakraborty | Roy Ka-Wei Lee
Findings of the Association for Computational Linguistics: EMNLP 2024

Moderating hate speech (HS) in the evolving online landscape is a complex challenge, compounded by the multimodal nature of digital content. This survey examines recent advancements in HS moderation, focusing on the burgeoning role of large language models (LLMs) and large multimodal models (LMMs) in detecting, explaining, debiasing, and countering HS. We begin with a comprehensive analysis of current literature, uncovering how text, images, and audio interact to spread HS. The combination of these modalities adds complexity and subtlety to HS dissemination. We also identified research gaps, particularly in underrepresented languages and cultures, and highlight the need for solutions in low-resource settings. The survey concludes with future research directions, including novel AI methodologies, ethical AI governance, and the development of context-aware systems. This overview aims to inspire further research and foster collaboration towards responsible and human-centric approaches to HS moderation in the digital age.

pdf bib
SGHateCheck: Functional Tests for Detecting Hate Speech in Low-Resource Languages of Singapore
Ri Chi Ng | Nirmalendu Prakash | Ming Shan Hee | Kenny Tsu Wei Choo | Roy Ka-wei Lee
Proceedings of the 8th Workshop on Online Abuse and Harms (WOAH 2024)

To address the limitations of current hate speech detection models, we introduce SGHateCheck, a novel framework designed for the linguistic and cultural context of Singapore and Southeast Asia. It extends the functional testing approach of HateCheck and MHC, employing large language models for translation and paraphrasing into Singapore’s main languages, and refining these with native annotators. SGHateCheck reveals critical flaws in state-of-the-art models, highlighting their inadequacy in sensitive content moderation. This work aims to foster the development of more effective hate speech detection tools for diverse linguistic environments, particularly for Singapore and Southeast Asia contexts.
Search
Co-authors
Fix author