Yao Li
2025
Enhancing Speech-to-Speech Dialogue Modeling with End-to-End Retrieval-Augmented Generation
Pengchao Feng | Ziyang Ma | Wenxi Chen | Yao Li | Sheng Wang | Kai Yu | Xie Chen
Findings of the Association for Computational Linguistics: EMNLP 2025
End-to-end speech-to-speech (S2S) dialogue systems have recently garnered increasing research attention for their lower latency and more natural integration of nonverbal cues such as emotion and speaker identity. However, these systems face key challenges, particularly in incorporating external knowledge, a capability commonly addressed by Retrieval-Augmented Generation (RAG) in text-based large language models (LLMs). The core difficulty lies in the modality gap between input speech and retrieved textual knowledge, which hinders effective integration of information. To address this issue, we propose a novel end-to-end RAG framework that directly retrieves relevant textual knowledge from speech queries. Experimental results demonstrate that our method significantly improves the performance of end-to-end S2S dialogue systems while achieving higher retrieval efficiency. Although the overall performance still lags behind the SOTA cascaded models, our framework offers a promising direction for enhancing knowledge integration in end-to-end S2S systems. Our code and dataset are released.
2022
ADDMU: Detection of Far-Boundary Adversarial Examples with Data and Model Uncertainty Estimation
Fan Yin | Yao Li | Cho-Jui Hsieh | Kai-Wei Chang
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
Adversarial Examples Detection (AED) is a crucial defense technique against adversarial attacks and has drawn increasing attention from the Natural Language Processing (NLP) community. Despite the surge of new AED methods, our studies show that existing methods heavily rely on a shortcut to achieve good performance. In other words, current search-based adversarial attacks in NLP stop once model predictions change, and thus most adversarial examples generated by those attacks are located near model decision boundaries. To surpass this shortcut and fairly evaluate AED methods, we propose to test AED methods with Far Boundary (FB) adversarial examples. Existing methods show worse-than-random-guess performance under this scenario. To overcome this limitation, we propose a new technique, ADDMU, adversary detection with data and model uncertainty, which combines two types of uncertainty estimation for both regular and FB adversarial example detection. Our new method outperforms previous methods by 3.6 and 6.0 AUC points under each scenario. Finally, our analysis shows that the two types of uncertainty provided by ADDMU can be leveraged to characterize adversarial examples and identify the ones that contribute most to the model’s robustness in adversarial training.
2011
Combining Syntactic and Semantic Features by SVM for Unrestricted Coreference Resolution
Huiwei Zhou | Yao Li | Degen Huang | Yan Zhang | Chunlong Wu | Yuansheng Yang
Proceedings of the Fifteenth Conference on Computational Natural Language Learning: Shared Task