ASTRO: Automatic Strategy Optimization For Non-Cooperative Dialogues

Yikuan Hu, Chen Huang, Wenqiang Lei


Abstract
Non-cooperative dialogues, such as negotiations and persuasion, present significant challenges for large language models (LLMs) due to the lack of inherent cooperation or shared goals. Current methods for optimizing dialogue strategies require substantial human effort. To address these challenges, we propose ASTRO (Automated Strategy Optimization), a fully automated solution that leverages LLMs’ self-evolving capabilities. ASTRO dynamically generates customized strategy sets based on task goals and optimizes a strategy planner using a self-play reinforcement learning paradigm. Our experimental results demonstrate ASTRO’s significant performance improvements over baseline models across various non-cooperative dialogue tasks, highlighting the potential for autonomously developing such agents without human intervention. Our code is available at https://github.com/SCUNLP/ASTRO.
Anthology ID:
2025.findings-acl.22
Volume:
Findings of the Association for Computational Linguistics: ACL 2025
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venues:
Findings | WS
Publisher:
Association for Computational Linguistics
Pages:
388–408
URL:
https://preview.aclanthology.org/ingestion-acl-25/2025.findings-acl.22/
Cite (ACL):
Yikuan Hu, Chen Huang, and Wenqiang Lei. 2025. ASTRO: Automatic Strategy Optimization For Non-Cooperative Dialogues. In Findings of the Association for Computational Linguistics: ACL 2025, pages 388–408, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
ASTRO: Automatic Strategy Optimization For Non-Cooperative Dialogues (Hu et al., Findings 2025)
PDF:
https://preview.aclanthology.org/ingestion-acl-25/2025.findings-acl.22.pdf