Tianchi Liu
2026
Evaluating the Expressive Appropriateness of Speech in Rich Contexts
Tianrui Wang | Ziyang Ma | Yizhou Peng | Haoyu Wang | Zhikang Niu | Zikang Huang | Yihao Wu | Yi-Wen Chao | Yu Jiang | Yuheng Lu | Guanrou Yang | Xuanchen Li | Hexin Liu | Chunyu Qiang | Cheng Gong | Yifan Yang | Tianchi Liu | Junyu Wang | Nana Hou | Meng Ge | Fuming You | Yang Wei | Zhongqian Sun | Hu Haifeng | Xiaobao Wang | Eng Siong Chng | Xie Chen | Longbiao Wang | Jianwu Dang
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Evaluating expressive speech remains challenging, as existing methods mainly assess emotional intensity and overlook whether a speech sample is expressively appropriate for its contextual setting. This limitation hinders reliable evaluation of speech systems used in narrative-driven and interactive applications, such as audiobooks and conversational agents. We introduce CEAEval, a Context-rich framework for Evaluating Expressive Appropriateness in speech, which assesses whether a speech sample expressively aligns with the underlying communicative intent implied by its discourse-level narrative context. To support this task, we construct CEAEval-D, the first context-rich speech dataset with real human performances in Mandarin conversational speech, providing narrative descriptions together with fifteen dimensions of human annotations covering expressive attributes and expressive appropriateness. We further develop CEAEval-M, a model that integrates knowledge distillation, planner-based multi-model collaboration, adaptive audio attention bias, and reinforcement learning to perform context-rich expressive appropriateness evaluation. Experiments on a human-annotated test set demonstrate that CEAEval-M substantially outperforms existing speech evaluation and analysis systems.
2025
Benchmarking Contextual and Paralinguistic Reasoning in Speech-LLMs: A Case Study with In-the-Wild Data
Qiongqiong Wang | Hardik Bhupendra Sailor | Tianchi Liu | Wenyu Zhang | Muhammad Huzaifah | Nattadaporn Lertcheva | Shuo Sun | Nancy F. Chen | Jinyang Wu | AiTi Aw
Findings of the Association for Computational Linguistics: EMNLP 2025
Recent speech-LLMs have shown impressive performance in tasks like transcription and translation, yet they remain limited in understanding the paralinguistic aspects of speech crucial for social and emotional intelligence. We propose CP-Bench, a benchmark for evaluating speech-LLMs on contextual paralinguistic reasoning: the integration of verbal content with non-verbal cues like emotion and prosody. The benchmark includes two curated question answering (QA) datasets requiring both linguistic and empathetic understanding. We evaluate state-of-the-art speech-LLMs, both open- and closed-source, and perform a comprehensive analysis across different question types. The top two models were further analyzed under temperature tuning to understand its effect on this task. Our benchmark reveals a key gap in existing evaluations and offers insights into building more context-aware and emotionally intelligent speech-capable LLMs.
Co-authors
- Aiti Aw 1
- Yi-Wen Chao 1
- Xie Chen 1
- Nancy Chen 1
- Eng Siong Chng 1
- Jianwu Dang 1
- Meng Ge 1
- Cheng Gong 1
- Hu Haifeng 1
- Nana Hou 1
- Zikang Huang 1
- Muhammad Huzaifah 1
- Yu Jiang 1
- Nattadaporn Lertcheva 1
- Xuanchen Li 1
- Hexin Liu 1
- Yuheng Lu 1
- Ziyang Ma 1
- Zhikang Niu 1
- Yizhou Peng 1
- Chunyu Qiang 1
- Hardik Bhupendra Sailor 1
- Zhongqian Sun 1
- Shuo Sun 1
- Tianrui Wang 1
- Haoyu Wang 1
- Junyu Wang 1
- Xiaobao Wang 1
- Longbiao Wang 1
- Qiongqiong Wang 1
- Yang Wei 1
- Yihao Wu 1
- Jinyang Wu 1
- Guanrou Yang 1
- Yifan Yang 1
- Fuming You 1
- Wenyu Zhang 1