Sijia Yao
2026
SimRPD: Optimizing Recruitment Proactive Dialogue Agents through Simulator-Based Data Evaluation and Selection
Zhiyong Cao | Dunqiang Liu | Qi Dai | Haojun Xu | Huai Yuen Khor | Hao Wang | Huan He | Yafei Liu | Ke Ma | Ruqian Shi | Sicheng Zhou | Sijia Yao
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026)
Task-oriented proactive dialogue agents play a pivotal role in recruitment, particularly for steering conversations towards specific business outcomes, such as acquiring social-media contacts for private-channel conversion. Although supervised fine-tuning and reinforcement learning have proven effective for training such agents, their performance is heavily constrained by the scarcity of high-quality, goal-oriented, domain-specific training data. To address this challenge, we propose SimRPD, a three-stage framework for training recruitment proactive dialogue agents. First, we develop a high-fidelity user simulator to synthesize large-scale conversational data through multi-turn online dialogue. Second, we introduce a multi-dimensional evaluation framework based on Chain-of-Intention (CoI) that assesses the simulator and selects high-quality data using both global-level and instance-level metrics. Finally, we train the recruitment proactive dialogue agent on the selected dataset. Experiments in a real-world recruitment scenario demonstrate that SimRPD outperforms existing simulator-based data selection strategies, highlighting its practical value for industrial deployment and its potential applicability to other business-oriented dialogue scenarios.
2025
ExpandR: Teaching Dense Retrievers Beyond Queries with LLM Guidance
Sijia Yao | Pengcheng Huang | Zhenghao Liu | Yu Gu | Yukun Yan | Shi Yu | Ge Yu
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Large language models (LLMs) have demonstrated significant potential in enhancing dense retrieval through query augmentation. However, most existing methods treat the LLM and the retriever as separate modules, overlooking the alignment between generation and ranking objectives. In this work, we propose ExpandR, a unified LLM-augmented dense retrieval framework that jointly optimizes both the LLM and the retriever. ExpandR employs the LLM to generate semantically rich query expansions, which are leveraged to enhance the retriever’s training. Simultaneously, the LLM is trained using Direct Preference Optimization (DPO), guided by a carefully designed reward function that balances retrieval effectiveness and generation consistency. This joint optimization paradigm enables mutual adaptation between the LLM and the retriever, resulting in query expansions that are both informative and well-suited for retrieval. Experimental results on multiple benchmarks show that ExpandR consistently outperforms strong baselines, achieving more than a 5% improvement in retrieval performance. All code is available at https://github.com/NEUIR/ExpandR.