Xiaoxue Gao
2026
MMAC: A Multilingual, Multimodal Alignment Framework for Cultural Grounding Evaluation
Weihua Zheng | Zhengyuan Liu | Tanmoy Chakraborty | Weiwen Xu | Xiaoxue Gao | Bryan Chen Zhengyu Tan | Bowei Zou | Chang Liu | Yujia Hu | Xing Xie | Xiaoyuan Yi | Jing Yao | Chaojun Wang | Long Li | Rui Liu | Huiyao Liu | Koji Inoue | Ryuichi Sumida | Tatsuya Kawahara | Fan Xu | Lingyu Ye | Wei Tian | Dongjun Kim | Jimin Jung | Jaehyung Seo | Nadya Yuki Wangsajaya | Pham Minh Duc | Ojasva Saxena | Palash Nandi | Xiyan Tao | Wiwik Karlina | Tuan Luong | Keertana Arun Vasan | Roy Ka-Wei Lee | Nancy F. Chen
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Weihua Zheng | Zhengyuan Liu | Tanmoy Chakraborty | Weiwen Xu | Xiaoxue Gao | Bryan Chen Zhengyu Tan | Bowei Zou | Chang Liu | Yujia Hu | Xing Xie | Xiaoyuan Yi | Jing Yao | Chaojun Wang | Long Li | Rui Liu | Huiyao Liu | Koji Inoue | Ryuichi Sumida | Tatsuya Kawahara | Fan Xu | Lingyu Ye | Wei Tian | Dongjun Kim | Jimin Jung | Jaehyung Seo | Nadya Yuki Wangsajaya | Pham Minh Duc | Ojasva Saxena | Palash Nandi | Xiyan Tao | Wiwik Karlina | Tuan Luong | Keertana Arun Vasan | Roy Ka-Wei Lee | Nancy F. Chen
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
The global deployment of Large Language Models (LLMs) underscores the urgent need to evaluate their cultural alignment. However, assessing genuine "cultural awareness" across modalities (text, vision, speech) and languages remains a significant challenge. To comprehensively investigate this domain, we propose MMAC, a systematic framework that encompasses a tri-modally aligned cultural benchmark creation pipeline and a five-dimensional evaluation protocol to assess cross-country awareness disparities, evaluate cross-lingual and cross-modal consistency, and verify cultural knowledge generalization and grounding validity. Given the prevailing Western cultural bias in current models, we focus on 8 Asian countries as our dataset foundation to more acutely reveal potential cultural deficiencies in LLMs. Our dataset, MMAC-bench, features 27,000 human-curated questions across 10 languages. Crucially, it is the first dataset aligned at the input level across text, image, and speech, enabling direct cross-modal transfer tests. Each question consists of multiple-choice options accompanied by open-ended generated explanations, where 79% require multi-step reasoning grounded in cultural context, moving beyond simple memorization. We probe the causes of modal divergence, offering insights into fostering culturally robust MLLMs.
2025
SingaKids: A Multilingual Multimodal Dialogic Tutor for Language Learning
Zhengyuan Liu | Geyu Lin | Hui Li Tan | Huayun Zhang | Yanfeng Lu | Xiaoxue Gao | Stella Xin Yin | Sun He | Hock Huan Goh | Lung Hsiang Wong | Nancy F. Chen
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 6: Industry Track)
Zhengyuan Liu | Geyu Lin | Hui Li Tan | Huayun Zhang | Yanfeng Lu | Xiaoxue Gao | Stella Xin Yin | Sun He | Hock Huan Goh | Lung Hsiang Wong | Nancy F. Chen
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 6: Industry Track)
The integration of generative artificial intelligence into educational applications has enhanced personalized and interactive learning experiences, and it shows strong potential to promote young learners language acquisition. However, it is still challenging to ensure consistent and robust performance across different languages and cultural contexts, and kids-friendly design requires simplified instructions, engaging interactions, and age-appropriate scaffolding to maintain motivation and optimize learning outcomes.In this work, we introduce SingaKids, a dialogic tutor designed to facilitate language learning through picture description tasks. Our system integrates dense image captioning, multilingual dialogic interaction, speech understanding, and engaging speech generation to create an immersive learning environment in four languages: English, Mandarin, Malay, and Tamil. We further improve the system through multilingual pre-training, task-specific tuning, and scaffolding optimization. Empirical studies with elementary school students demonstrate that SingaKids provides effective dialogic teaching, benefiting learners at different performance levels.
2024
Beyond Single-Audio: Advancing Multi-Audio Processing in Audio Large Language Models
Yiming Chen | Xianghu Yue | Xiaoxue Gao | Chen Zhang | Luis Fernando D’Haro | Robby T. Tan | Haizhou Li
Findings of the Association for Computational Linguistics: EMNLP 2024
Yiming Chen | Xianghu Yue | Xiaoxue Gao | Chen Zhang | Luis Fernando D’Haro | Robby T. Tan | Haizhou Li
Findings of the Association for Computational Linguistics: EMNLP 2024
Various audio-LLMs (ALLMs) have been explored recently for tackling different audio tasks simultaneously using a single, unified model. While existing evaluations of ALLMs primarily focus on single-audio tasks, real-world applications often involve processing multiple audio streams simultaneously. To bridge this gap, we propose the first multi-audio evaluation (MAE) benchmark that consists of 20 datasets from 11 multi-audio tasks encompassing both speech and sound scenarios. Comprehensive experiments on MAE demonstrate that the existing ALLMs, while being powerful in comprehending primary audio elements in individual audio inputs, struggling to handle multi-audio scenarios. To this end, we propose a novel multi-audio-LLM (MALLM) to capture audio context among multiple similar audios using discriminative learning on our proposed synthetic data. The results demonstrate that the proposed MALLM outperforms all baselines and achieves high data efficiency using synthetic data without requiring human annotations. The proposed MALLM opens the door for ALLMs towards multi-audio processing era and brings us closer to replicating human auditory capabilities in machines.
Search
Fix author
Co-authors
- Nancy Chen 2
- Zhengyuan Liu 2
- Tanmoy Chakraborty 1
- Yiming Chen 1
- Pham Minh Duc 1
- Luis Fernando D’Haro 1
- Hock Huan Goh 1
- Sun He 1
- Yujia Hu 1
- Koji Inoue 1
- Jimin Jung 1
- Wiwik Karlina 1
- Tatsuya Kawahara 1
- Dongjun Kim 1
- Roy Ka-Wei Lee 1
- Long Li 1
- Haizhou Li 1
- Geyu Lin 1
- Chang Liu 1
- Rui Liu 1
- Huiyao Liu 1
- Yanfeng Lu 1
- Tuan Luong 1
- Palash Nandi 1
- Ojasva Saxena 1
- Jaehyung Seo 1
- Ryuichi Sumida 1
- Bryan Chen Zhengyu Tan 1
- Robby T. Tan 1
- Hui Li Tan 1
- Xiyan Tao 1
- Wei Tian 1
- Keertana Arun Vasan 1
- Chaojun Wang 1
- Nadya Yuki Wangsajaya 1
- Lung Hsiang Wong 1
- Xing Xie 1
- Weiwen Xu 1
- Fan Xu (徐凡) 1
- Jing Yao 1
- Lingyu Ye 1
- Xiaoyuan Yi 1
- Stella Xin Yin 1
- Xianghu Yue 1
- Chen Zhang 1
- Huayun Zhang 1
- Weihua Zheng 1
- Bowei Zou (邹博伟) 1