Yiwen Zhao
2026
Rethinking the Role of Entropy in Optimizing Tool-Use Behaviors for Large Language Model Agents
Zeping Li | Hongru Wang | Yiwen Zhao | Guanhua Chen | Yixia Li | Keyang Chen | Yixin Cao | Guangnan Ye | Hongfeng Chai | Zhenfei Yin
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Tool-using agents based on Large Language Models (LLMs) excel in tasks such as mathematical reasoning and multi-hop question answering. However, in long trajectories, agents often trigger excessive and low-quality tool calls, which increase latency and degrade inference performance, so managing tool-use behavior is challenging. In this work, we conduct entropy-based pilot experiments and observe a strong positive correlation between entropy reduction and high-quality tool calls. Building on this finding, we propose using entropy reduction as a supervisory signal and design two reward strategies to address the differing needs of optimizing tool-use behavior. Sparse outcome rewards provide coarse, trajectory-level guidance to improve efficiency, while dense process rewards offer fine-grained supervision to enhance performance. Experiments across diverse domains show that both reward designs improve tool-use behavior: the former reduces tool calls by 72.07% compared to the average of baselines, while the latter improves performance by 22.27%. These results position entropy reduction as a key mechanism for enhancing tool-use behavior, enabling agents to be more adaptive in real-world applications.
2025
VERSA: A Versatile Evaluation Toolkit for Speech, Audio, and Music
Jiatong Shi | Hye-jin Shim | Jinchuan Tian | Siddhant Arora | Haibin Wu | Darius Petermann | Jia Qi Yip | You Zhang | Yuxun Tang | Wangyou Zhang | Dareen Safar Alharthi | Yichen Huang | Koichi Saito | Jionghao Han | Yiwen Zhao | Chris Donahue | Shinji Watanabe
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (System Demonstrations)
In this work, we introduce VERSA, a unified and standardized evaluation toolkit designed for various speech, audio, and music signals. The toolkit features a Pythonic interface with flexible configuration and dependency control, making it user-friendly and efficient. With full installation, VERSA offers 65 metrics with 729 metric variations based on different configurations. These metrics encompass evaluations utilizing diverse external resources, including matching and non-matching reference audio, text transcriptions, and text captions. As a lightweight yet comprehensive toolkit, VERSA is versatile enough to support evaluation across a wide range of downstream scenarios. To demonstrate its capabilities, this work highlights example use cases for VERSA, including audio coding, speech synthesis, speech enhancement, singing synthesis, and music generation. The toolkit is available at https://github.com/shinjiwlab/versa.
ESPnet-SpeechLM: An Open Speech Language Model Toolkit
Jinchuan Tian | Jiatong Shi | William Chen | Siddhant Arora | Yoshiki Masuyama | Takashi Maekaku | Yihan Wu | Junyi Peng | Shikhar Bharadwaj | Yiwen Zhao | Samuele Cornell | Yifan Peng | Xiang Yue | Chao-Han Huck Yang | Graham Neubig | Shinji Watanabe
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (System Demonstrations)
We present ESPnet-SpeechLM, an open toolkit designed to democratize the development of speech language models (SpeechLMs) and voice-driven agentic applications. The toolkit standardizes speech processing tasks by framing them as universal sequential modeling problems, encompassing a cohesive workflow of data preprocessing, pre-training, inference, and task evaluation. With ESPnet-SpeechLM, users can easily define task templates and configure key settings, enabling seamless and streamlined SpeechLM development. The toolkit ensures flexibility, efficiency, and scalability by offering highly configurable modules for every stage of the workflow. To illustrate its capabilities, we provide multiple use cases demonstrating how competitive SpeechLMs can be constructed with ESPnet-SpeechLM, including a 1.7B-parameter model pre-trained on both text and speech tasks, across diverse benchmarks. The toolkit and its recipes are fully transparent and reproducible at: https://github.com/espnet/espnet/tree/speechlm.
Co-authors
- Siddhant Arora 2
- Jiatong Shi 2
- Jinchuan Tian 2
- Shinji Watanabe 2
- Dareen Safar Alharthi 1
- Shikhar Bharadwaj 1
- Yixin Cao 1
- Hongfeng Chai (柴洪峰) 1
- Guanhua Chen 1
- Keyang Chen 1
- William Chen 1
- Samuele Cornell 1
- Chris Donahue 1
- Jionghao Han 1
- Yichen Huang 1
- Yixia Li 1
- Zeping Li 1
- Takashi Maekaku 1
- Yoshiki Masuyama 1
- Graham Neubig 1
- Junyi Peng 1
- Yifan Peng 1
- Darius Petermann 1
- Koichi Saito 1
- Hye-jin Shim 1
- Yuxun Tang 1
- Hongru Wang 1
- Haibin Wu 1
- Yihan Wu 1
- Chao-Han Huck Yang 1
- Guangnan Ye (叶广楠) 1
- Zhenfei Yin 1
- Jia Qi Yip 1
- Xiang Yue 1
- Wangyou Zhang 1
- You Zhang 1