Xinyu Tang


2025

DAWN-ICL: Strategic Planning of Problem-solving Trajectories for Zero-Shot In-Context Learning
Xinyu Tang | Xiaolei Wang | Xin Zhao | Ji-Rong Wen
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)

Zero-shot in-context learning (ZS-ICL) aims to conduct in-context learning (ICL) without using human-annotated demonstrations. Existing ZS-ICL methods either use large language models (LLMs) to generate (input, label) pairs as pseudo-demonstrations or leverage historical pseudo-demonstrations to help solve the current problem. They assume that all problems come from the same task and traverse them in a random order. However, in real-world scenarios, problems usually come from diverse tasks, and only a few belong to the same task. A random traversal order may generate unreliable pseudo-demonstrations and lead to error accumulation. To address this problem, we reformulate ZS-**ICL** as a planning problem and propose a **D**emonstration-**AW**are Mo**N**te Carlo Tree Search (MCTS) approach (DAWN-ICL), which leverages MCTS to strategically plan the problem-solving trajectories for ZS-ICL. In addition, to achieve effective and efficient Q-value estimation, we propose a demonstration-aware Q-value function and use it to enhance the selection phase and accelerate the expansion and simulation phases in MCTS. Extensive experiments demonstrate the effectiveness and efficiency of DAWN-ICL in both in-domain and cross-domain scenarios, and it even outperforms ICL using human-annotated demonstrations. The code is available at https://github.com/txy77/MCTS4ZSICL.
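
As a rough illustration of the selection step this abstract describes (not the paper's implementation), the sketch below shows a UCT-style selection rule in which the exploitation term comes from a demonstration-aware Q estimate. Here `demo_aware_q` and `sim` are hypothetical placeholders for the paper's actual value function; see the repository above for the real method.

```python
import math
from dataclasses import dataclass


def sim(a: str, b: str) -> float:
    """Toy similarity: Jaccard overlap of word sets (illustrative placeholder)."""
    wa, wb = set(a.split()), set(b.split())
    return len(wa & wb) / len(wa | wb) if (wa | wb) else 0.0


@dataclass
class Node:
    problem: str     # the problem this child would solve next
    visits: int = 0  # MCTS visit count


def demo_aware_q(node: Node, demo_bank: list[str]) -> float:
    """Stand-in for a demonstration-aware Q-value: mean similarity between a
    candidate problem and the pseudo-demonstrations accumulated so far."""
    if not demo_bank:
        return 0.0
    return sum(sim(node.problem, d) for d in demo_bank) / len(demo_bank)


def uct_select(children: list[Node], demo_bank: list[str], c: float = 1.0) -> Node:
    """Standard UCT selection: exploitation (Q) plus an exploration bonus."""
    total = sum(ch.visits for ch in children) + 1

    def score(ch: Node) -> float:
        explore = c * math.sqrt(math.log(total) / (ch.visits + 1))
        return demo_aware_q(ch, demo_bank) + explore

    return max(children, key=score)
```

The point of the sketch is only the shape of the idea: the Q term depends on the pseudo-demonstrations gathered so far, so the planner prefers traversal orders that build up reliable demonstrations rather than visiting problems at random.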

2023

Rethinking the Evaluation for Conversational Recommendation in the Era of Large Language Models
Xiaolei Wang | Xinyu Tang | Xin Zhao | Jingyuan Wang | Ji-Rong Wen
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

The recent success of large language models (LLMs) has shown great potential for developing more powerful conversational recommender systems (CRSs), which rely on natural language conversations to satisfy user needs. In this paper, we embark on an investigation into the utilization of ChatGPT for CRSs, revealing the inadequacy of the existing evaluation protocol. It may overemphasize matching with human-annotated ground-truth items while neglecting the interactive nature of CRSs. To overcome this limitation, we further propose an **i**nteractive **Eva**luation approach based on **L**L**M**s, named **iEvaLM**, which harnesses LLM-based user simulators. Our evaluation approach can simulate various system-user interaction scenarios. Through experiments on two public CRS datasets, we demonstrate notable improvements compared to the prevailing evaluation protocol. Furthermore, we emphasize the evaluation of explainability, and ChatGPT showcases persuasive explanation generation for its recommendations. Our study contributes to a deeper comprehension of the untapped potential of LLMs for CRSs and provides a more flexible and realistic evaluation approach for future research on LLM-based CRSs.
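
As a rough illustration of the interactive protocol this abstract describes (not iEvaLM's actual prompts or code), the sketch below runs one simulated dialogue and counts a success when the target item appears among the system's recommendations within a turn budget. `user_simulate` and `crs_respond` are hypothetical callables standing in for an LLM user simulator and a CRS.

```python
from typing import Callable, List, Tuple


def interactive_eval(
    target_item: str,
    user_simulate: Callable[[str, list], str],
    crs_respond: Callable[[list], Tuple[List[str], str]],
    max_turns: int = 5,
) -> bool:
    """Run one simulated dialogue; success if the CRS recommends the
    target item within the turn budget (a recall-style interactive metric)."""
    history: list = []
    for _ in range(max_turns):
        user_utt = user_simulate(target_item, history)  # LLM plays the seeker
        history.append(("user", user_utt))
        recs, sys_utt = crs_respond(history)            # CRS recommends + replies
        history.append(("system", sys_utt))
        if target_item in recs:
            return True  # hit within the interaction budget
    return False
```

Compared with single-shot matching against annotated ground truth, a loop like this lets the evaluation credit systems that elicit preferences over several turns before recommending.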