SAGE: Steering Dialog Generation with Future-Aware State-Action Augmentation

Yizhe Zhang, Navdeep Jaitly


Abstract
Recent advances in large language models have enabled impressive task-oriented applications, yet building emotionally intelligent chatbots for natural, strategic conversations remains challenging. Current approaches often assume a single “ground truth” for emotional responses, overlooking the subjectivity of human emotion. We present a novel perspectivist approach, SAGE, that models multiple perspectives in dialogue generation using latent variables. At its core is the State-Action Chain (SAC), which augments standard fine-tuning with latent variables capturing diverse emotional states and conversational strategies between turns, in a future-looking manner. During inference, these variables are generated before each response, enabling multi-perspective control while preserving natural interactions. We also introduce a self-improvement pipeline combining dialogue tree search, LLM-based reward modeling, and targeted fine-tuning to optimize conversational trajectories. Experiments show improved LLM-based judgments while maintaining strong general LLM performance. The discrete latent variables further enable search-based strategies and open avenues for state-level reinforcement learning in dialogue systems, where learning can occur at the state level rather than the token level.
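As a rough illustration of the State-Action Chain idea described in the abstract, the sketch below prefixes each assistant turn with a latent state and action line for fine-tuning, and strips those lines before text is shown to a user. The abstract does not specify the actual serialization, so the `<state>`/`<action>` tags, the `Turn` class, and the function names `augment_dialog` and `strip_latents` are all hypothetical choices made for this example.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Turn:
    speaker: str                  # "user" or "assistant"
    text: str
    state: Optional[str] = None   # hypothetical latent: inferred emotional state
    action: Optional[str] = None  # hypothetical latent: conversational strategy


def augment_dialog(turns: List[Turn]) -> str:
    """Serialize a dialog, placing latent lines before each annotated assistant turn."""
    lines = []
    for t in turns:
        if t.speaker == "assistant" and t.state is not None and t.action is not None:
            # Assumed tag format; the paper's real format may differ.
            lines.append(f"<state>{t.state}</state>")
            lines.append(f"<action>{t.action}</action>")
        lines.append(f"{t.speaker}: {t.text}")
    return "\n".join(lines)


# Matches a whole line consisting of one latent tag pair, plus its newline.
_LATENT = re.compile(r"^<(state|action)>.*</\1>\n?", re.MULTILINE)


def strip_latents(text: str) -> str:
    """Remove latent lines so only the natural dialog remains."""
    return _LATENT.sub("", text)
```

In this sketch a model fine-tuned on `augment_dialog` output would emit the state and action lines before each response, and `strip_latents` would hide them from the end user; the same discrete lines could in principle serve as nodes for a search over candidate continuations.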
Anthology ID:
2025.nlperspectives-1.11
Volume:
Proceedings of the 4th Workshop on Perspectivist Approaches to NLP
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Gavin Abercrombie, Valerio Basile, Simona Frenda, Sara Tonelli, Shiran Dudy
Venues:
NLPerspectives | WS
Publisher:
Association for Computational Linguistics
Pages:
123–132
URL:
https://preview.aclanthology.org/ingest-emnlp/2025.nlperspectives-1.11/
Cite (ACL):
Yizhe Zhang and Navdeep Jaitly. 2025. SAGE: Steering Dialog Generation with Future-Aware State-Action Augmentation. In Proceedings of the 4th Workshop on Perspectivist Approaches to NLP, pages 123–132, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
SAGE: Steering Dialog Generation with Future-Aware State-Action Augmentation (Zhang & Jaitly, NLPerspectives 2025)
PDF:
https://preview.aclanthology.org/ingest-emnlp/2025.nlperspectives-1.11.pdf