Yuhao Du
2026
S2S-Arena: Evaluating Paralinguistic Instruction Following in Speech-to-Speech Models
Feng Jiang | Zhiyu Lin | Yiyang Liu | Liumeng Xue | Fan Bu | Yuhao Du | Xiangying Chen | Benyou Wang | Haizhou Li
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Recent advances in large language models (LLMs) have fundamentally reshaped speech-to-speech (S2S) systems, enabling increasingly natural spoken interaction. However, existing benchmarks still rely heavily on text-based evaluation and largely ignore paralinguistic cues such as prosody, emotion, and speaker traits, which are central to expressive and human-like communication. We introduce S2S-Arena, a speech-native benchmark for evaluating instruction-following S2S models with explicit assessment of both semantic understanding and paralinguistic expression. S2S-Arena features a four-level interaction protocol that systematically probes models under increasing paralinguistic complexity, a two-stage data construction pipeline that produces 1,243 speech samples spanning 100+ real-world tasks, and an arena-style evaluation framework that enables reference-free, pairwise comparison directly in the speech modality. Benchmarking 10 state-of-the-art S2S systems over 1,000+ comparisons reveals substantial performance gaps (especially under complex paralinguistic demands) between current academic and industrial systems. Our analysis further identifies key design factors governing expressive instruction following, providing actionable insights for building more natural, robust, and human-aligned speech agents.
2025
Second Language (Arabic) Acquisition of LLMs via Progressive Vocabulary Expansion
Jianqing Zhu | Huang Huang | Zhihang Lin | Juhao Liang | Zhengyang Tang | Khalid Almubarak | Mosen Alharthi | Bang An | Juncai He | Xiangbo Wu | Fei Yu | Junying Chen | Ma Zhuoheng | Yuhao Du | He Zhang | Saied Alshahrani | Emad A. Alghamdi | Lian Zhang | Ruoyu Sun | Haizhou Li | Benyou Wang | Jinchao Xu
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
This paper addresses the critical need for democratizing large language models (LLMs) in the Arab world, a region that has seen slower progress in developing models comparable to state-of-the-art offerings like GPT-4 or GPT-3.5, due to a predominant focus on mainstream languages (e.g., English and Chinese). One practical objective for Arabic LLMs is to utilize Arabic-specific vocabulary in the tokenizer to accelerate decoding. However, using a different vocabulary often leads to degradation of the model’s learned knowledge, since many words become out-of-vocabulary (OOV) at the beginning of training. Inspired by vocabulary learning during human Second Language (Arabic) Acquisition, the released AraLLaMA employs progressive vocabulary expansion, which is implemented by a modified BPE algorithm that progressively extends the Arabic subwords in its dynamic vocabulary during training, thereby balancing the OOV ratio at every stage. The ablation study demonstrates the effectiveness of Progressive Vocabulary Expansion. Moreover, AraLLaMA achieves decent performance comparable to the best Arabic LLMs across a variety of Arabic benchmarks. Our model weights are available at: https://github.com/FreedomIntelligence/AraLLaMa.