MERaLiON-AudioLLM: Advancing Speech and Language Understanding for Singapore
Yingxu He, Zhuohan Liu, Geyu Lin, Shuo Sun, Bin Wang, Wenyu Zhang, Xunlong Zou, Nancy F. Chen, AiTi Aw
Abstract
We introduce MERaLiON-AudioLLM, the first general-purpose audio-based large language model designed for multitask learning, with a particular focus on Singlish understanding. Trained on 62 million multimodal instruction samples comprising a total of 260k hours of audio, it exhibits strong generalization across a diverse set of tasks, including—but not limited to—automatic speech recognition, spoken question answering, speech translation, and paralinguistic analysis. Our results show significant improvements in local speech recognition and task-specific understanding, making MERaLiON-AudioLLM a leading solution for region-specific AI applications. An interactive demo has been developed to enable user-friendly interactions, supported by a backend with customized caching and load-balancing mechanisms. We benchmark the model across a broad range of multilingual and multitask scenarios, where it demonstrates competitive performance compared to other open-source models. The demo page, model weights and videos are publically accessible.- Anthology ID:
- 2025.acl-demo.3
- Volume:
- Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)
- Month:
- July
- Year:
- 2025
- Address:
- Vienna, Austria
- Editors:
- Pushkar Mishra, Smaranda Muresan, Tao Yu
- Venue:
- ACL
- SIG:
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 22–30
- Language:
- URL:
- https://preview.aclanthology.org/ingestion-acl-25/2025.acl-demo.3/
- DOI:
- Cite (ACL):
- Yingxu He, Zhuohan Liu, Geyu Lin, Shuo Sun, Bin Wang, Wenyu Zhang, Xunlong Zou, Nancy F. Chen, and AiTi Aw. 2025. MERaLiON-AudioLLM: Advancing Speech and Language Understanding for Singapore. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations), pages 22–30, Vienna, Austria. Association for Computational Linguistics.
- Cite (Informal):
- MERaLiON-AudioLLM: Advancing Speech and Language Understanding for Singapore (He et al., ACL 2025)
- PDF:
- https://preview.aclanthology.org/ingestion-acl-25/2025.acl-demo.3.pdf