MUSCAT: MUltilingual, SCientific ConversATion Benchmark

Supriti Sinhamahapatra, Thai-Binh Nguyen, Yiğit Oğuz, Enes Yavuz Ugan, Jan Niehues, Alexander Waibel


Abstract
The goal of multilingual speech technology is to facilitate seamless communication between individuals speaking different languages, creating the experience that everyone is a multilingual speaker. To create this experience, speech technology needs to address several challenges: handling mixed multilingual input, specialized vocabulary, and code-switching. However, there is currently no dataset that benchmarks this situation. We propose a new benchmark to evaluate whether current Automatic Speech Recognition (ASR) systems can handle these challenges. The benchmark consists of bilingual discussions on scientific papers between multiple speakers, each conversing in a different language. We provide a standard evaluation framework that goes beyond Word Error Rate (WER), enabling consistent comparison of ASR performance across languages. Experimental results demonstrate that the proposed dataset remains an open challenge for state-of-the-art ASR systems. The dataset is available at https://huggingface.co/datasets/goodpiku/muscat-eval
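For readers unfamiliar with the baseline metric named in the abstract, the following is a minimal sketch of word-level WER, computed as the edit distance between reference and hypothesis words divided by the number of reference words. It is not the benchmark's evaluation framework, which the abstract says goes beyond WER; the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration only.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumption: words are separated by whitespace and compared
    case-sensitively, with no text normalization applied.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical code-switched utterance, not taken from the dataset.
    print(word_error_rate("das ist a benchmark", "das ist ein benchmark"))
```

Because the score is normalized by reference length only, it can exceed 1.0 when the hypothesis contains many inserted words.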
Anthology ID:
2026.lrec-main.471
Volume:
Proceedings of the Fifteenth Language Resources and Evaluation Conference
Month:
May
Year:
2026
Address:
Palma de Mallorca, Spain
Editors:
Stelios Piperidis, Núria Bel, Henk van den Heuvel, Nancy Ide, Simon Krek, Antonio Toral
Venue:
LREC
Publisher:
ELRA Language Resource Association
Pages:
5926–5937
URL:
https://preview.aclanthology.org/ingest-lrec/2026.lrec-main.471/
Cite (ACL):
Supriti Sinhamahapatra, Thai-Binh Nguyen, Yiğit Oğuz, Enes Yavuz Ugan, Jan Niehues, and Alexander Waibel. 2026. MUSCAT: MUltilingual, SCientific ConversATion Benchmark. International Conference on Language Resources and Evaluation, main:5926–5937.
Cite (Informal):
MUSCAT: MUltilingual, SCientific ConversATion Benchmark (Sinhamahapatra et al., LREC 2026)
PDF:
https://preview.aclanthology.org/ingest-lrec/2026.lrec-main.471.pdf