2025
ASTRA: A Negotiation Agent with Adaptive and Strategic Reasoning via Tool-integrated Action for Dynamic Offer Optimization
Deuksin Kwon | Jiwon Hae | Emma Clift | Daniel Shamsoddini | Jonathan Gratch | Gale Lucas
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Negotiation requires dynamically balancing self-interest and cooperation within the flow of conversation to maximize one’s own utility. Yet, existing agents struggle due to bounded rationality in human data, low adaptability to counterpart behavior, and limited strategic reasoning. To address this, we introduce principle-driven negotiation agents, powered by ASTRA, a novel framework for turn-level offer optimization grounded in two core principles: opponent modeling and Tit-for-Tat reciprocity. ASTRA operates in three stages: (1) interpreting counterpart behavior, (2) optimizing counteroffers via a tool-integrated action with a linear programming (LP) solver, and (3) selecting offers based on strategy assessment and the partner’s acceptance probability. Through simulations and human evaluations, our agent effectively adapts to an opponent’s shifting stance and achieves favorable outcomes through enhanced adaptability and strategic reasoning. Beyond enhancing negotiation performance, it also serves as a powerful coaching tool, offering interpretable strategic feedback and optimal offer recommendations beyond human bounded rationality, with its potential further validated through human evaluation.
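Stage (2) of the pipeline lends itself to a compact illustration. Below is a minimal sketch, not the authors' implementation, of LP-based counteroffer optimization: meet a self-utility target (e.g., set by a Tit-for-Tat concession schedule) while maximizing the estimated value of the conceded items to the opponent, with priorities taken from opponent modeling. All function names, values, and quantities are illustrative assumptions.

```python
# Hypothetical sketch of ASTRA-style stage (2): LP-based counteroffer
# optimization. The paper's actual formulation may differ.
import numpy as np
from scipy.optimize import linprog

def optimize_counteroffer(own_vals, opp_vals, totals, target_utility):
    """Choose how many units of each issue to keep (x) so that our own
    utility meets a Tit-for-Tat-derived target, while maximizing the
    estimated value of the remainder (totals - x) to the opponent."""
    own_vals, opp_vals, totals = map(np.asarray, (own_vals, opp_vals, totals))
    # Maximizing opp_vals @ (totals - x) is equivalent to minimizing opp_vals @ x.
    res = linprog(
        c=opp_vals,                                # minimize value withheld from opponent
        A_ub=[-own_vals], b_ub=[-target_utility],  # i.e., own_vals @ x >= target_utility
        bounds=list(zip(np.zeros_like(totals), totals)),
        method="highs",
    )
    # Rounding is a simplification; exact integer allocations need an ILP.
    return np.round(res.x).astype(int) if res.success else None

# Example: three issues with 3 units each, asymmetric per-unit values.
offer = optimize_counteroffer(own_vals=[5, 4, 3], opp_vals=[3, 4, 5],
                              totals=[3, 3, 3], target_utility=24)
print(offer)  # units kept per issue; the rest is offered to the opponent
```

The acceptance-probability gate in stage (3) would then score candidate offers like this one before anything is sent to the partner.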
Evaluating Behavioral Alignment in Conflict Dialogue: A Multi-Dimensional Comparison of LLM Agents and Humans
Deuksin Kwon | Kaleen Shrestha | Bin Han | Elena Hayoung Lee | Gale Lucas
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Large Language Models (LLMs) are increasingly deployed in socially complex, interaction-driven tasks, yet their ability to mirror human behavior in emotionally and strategically complex contexts remains underexplored. This study assesses the behavioral alignment of personality-prompted LLMs in adversarial dispute resolution by simulating multi-turn conflict dialogues that incorporate negotiation. Each LLM is guided by a matched Five-Factor personality profile to control for individual variation and enhance realism. We evaluate alignment across three dimensions: linguistic style, emotional expression (e.g., anger dynamics), and strategic behavior. GPT-4.1 achieves the closest alignment with humans in linguistic style and emotional dynamics, while Claude-3.7-Sonnet best reflects strategic behavior. Nonetheless, substantial alignment gaps persist. Our findings establish a benchmark for alignment between LLMs and humans in socially complex interactions, underscoring both the promise and the limitations of personality conditioning in dialogue modeling.
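As a concrete, purely illustrative sketch of this setup, the snippet below shows one way to condition an agent on a matched Five-Factor profile and to score distributional alignment of per-turn behavior labels with Jensen-Shannon distance; the paper's actual prompts and metrics are not reproduced here and may differ.

```python
# Illustrative sketch (assumptions throughout, not the paper's code).
from collections import Counter
from scipy.spatial.distance import jensenshannon

def persona_prompt(profile):
    """profile: dict mapping Big Five traits to 0-1 scores."""
    traits = ", ".join(f"{t}: {s:.2f}" for t, s in profile.items())
    return ("You are a disputant in a conflict-resolution dialogue. "
            f"Act consistently with this Five-Factor personality profile: {traits}.")

def label_distribution(labels, vocab):
    counts = Counter(labels)
    total = sum(counts.values()) or 1
    return [counts[v] / total for v in vocab]

# e.g., per-turn anger-expression labels from human vs. LLM dialogues
VOCAB = ["none", "mild", "strong"]
human = label_distribution(["none", "mild", "mild", "strong"], VOCAB)
agent = label_distribution(["none", "none", "mild", "strong"], VOCAB)
print(persona_prompt({"openness": 0.7, "agreeableness": 0.3}))
print(jensenshannon(human, agent))  # 0.0 would mean identical label distributions
```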
Implicit Behavioral Alignment of Language Agents in High-Stakes Crowd Simulations
Yunzhe Wang | Gale Lucas | Burcin Becerik-Gerber | Volkan Ustun
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Language-driven generative agents have enabled large-scale social simulations with transformative uses, from interpersonal training to aiding global policy-making. However, recent studies indicate that generative agent behaviors often deviate from expert expectations and real-world data—a phenomenon we term the Behavior-Realism Gap. To address this, we introduce a theoretical framework called Persona-Environment Behavioral Alignment (PEBA), formulated as a distribution matching problem grounded in Lewin’s behavior equation stating that behavior is a function of the person and their environment. Leveraging PEBA, we propose PersonaEvolve (PEvo), an LLM-based optimization algorithm that iteratively refines agent personas, implicitly aligning their collective behaviors with realistic expert benchmarks within a specified environmental context. We validate PEvo in an active shooter incident simulation we developed, achieving an 84% average reduction in distributional divergence compared to no steering and a 34% improvement over explicit instruction baselines. Results also show PEvo-refined personas generalize to novel, related simulation scenarios. Our method greatly enhances behavioral realism and reliability in high-stakes social simulations. More broadly, the PEBA-PEvo framework provides a principled approach to developing trustworthy LLM-driven social simulations.
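The optimization loop is straightforward to sketch. The skeleton below is a schematic reading of PEvo under stated assumptions (the simulation, divergence measure, and LLM rewriting step are stand-in callables), not the authors' implementation:

```python
# Schematic PEvo-style loop: iteratively rewrite personas so that the
# simulated behavior distribution approaches an expert benchmark.
from scipy.spatial.distance import jensenshannon

def pevo(personas, run_simulation, rewrite_with_llm, expert_dist, iters=10):
    """run_simulation: personas -> behavior distribution over a fixed label set.
    rewrite_with_llm: proposes revised personas given the observed mismatch."""
    best, best_gap = personas, float("inf")
    for _ in range(iters):
        sim_dist = run_simulation(personas)         # P(behavior | personas, environment)
        gap = jensenshannon(sim_dist, expert_dist)  # distributional divergence
        if gap < best_gap:
            best, best_gap = personas, gap
        personas = rewrite_with_llm(personas, sim_dist, expert_dist)
    return best, best_gap
```

The key property, per the abstract, is that the alignment is implicit: only the persona descriptions are edited, never the agents' behavior rules.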
2024
Are LLMs Effective Negotiators? Systematic Evaluation of the Multifaceted Capabilities of LLMs in Negotiation Dialogues
Deuksin Kwon | Emily Weiss | Tara Kulshrestha | Kushal Chawla | Gale Lucas | Jonathan Gratch
Findings of the Association for Computational Linguistics: EMNLP 2024
A successful negotiation requires a range of capabilities, including comprehension of the conversation context, Theory-of-Mind (ToM) skills to infer the partner’s motives, strategic reasoning, and effective communication, making it challenging for automated systems. Despite the remarkable performance of LLMs in various NLP tasks, there is no systematic evaluation of their capabilities in negotiation. Such an evaluation is critical for advancing AI negotiation agents and negotiation research, ranging from designing dialogue systems to providing pedagogical feedback and scaling up data collection practices. This work aims to systematically analyze the multifaceted capabilities of LLMs across diverse dialogue scenarios throughout the stages of a typical negotiation interaction. Our analysis highlights GPT-4’s superior performance in many tasks while identifying specific challenges, such as making subjective assessments and generating contextually appropriate, strategically advantageous responses.
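One probe in such an evaluation can be sketched generically; everything below (the prompt wording, the llm_complete callable, the scoring rule) is a hypothetical stand-in rather than the paper's protocol:

```python
# Hypothetical probe: infer the partner's issue priorities (a ToM subtask)
# from a partial dialogue and score the predicted ranking against gold.
def priority_probe(llm_complete, dialogue_turns, issues):
    prompt = ("Here is a partial negotiation:\n"
              + "\n".join(dialogue_turns)
              + f"\nRank the issues {issues} from most to least important "
                "to your partner. Answer as a comma-separated list.")
    return [i.strip() for i in llm_complete(prompt).split(",")]

def exact_match(pred_order, gold_order):
    return float(pred_order == gold_order)
```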
2023
Social Influence Dialogue Systems: A Survey of Datasets and Models For Social Influence Tasks
Kushal Chawla | Weiyan Shi | Jingwen Zhang | Gale Lucas | Zhou Yu | Jonathan Gratch
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics
Dialogue systems capable of social influence, such as persuasion, negotiation, and therapy, are essential for extending the use of technology to numerous realistic scenarios. However, existing research primarily focuses on either task-oriented or open-domain scenarios, a categorization that has been inadequate for capturing influence skills systematically. There exists no formal definition or category for dialogue systems with these skills, and data-driven efforts in this direction are highly limited. In this work, we formally define and introduce the category of social influence dialogue systems that influence users’ cognitive and emotional responses, leading to changes in thoughts, opinions, and behaviors through natural conversations. We present a survey of various tasks, datasets, and methods, compiling the progress across seven diverse domains. We discuss the commonalities and differences between the examined systems, identify limitations, and recommend future directions. This study serves as a comprehensive reference for social influence dialogue systems to inspire more dedicated research and discussion in this emerging area.
Be Selfish, But Wisely: Investigating the Impact of Agent Personality in Mixed-Motive Human-Agent Interactions
Kushal Chawla | Ian Wu | Yu Rong | Gale Lucas | Jonathan Gratch
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
A natural way to design a negotiation dialogue system is via self-play RL: train an agent that learns to maximize its performance by interacting with a simulated user that has been designed to imitate human-human dialogue data. Although this procedure has been adopted in prior work, we find that it results in a fundamentally flawed system that fails to learn the value of compromise in a negotiation, which can often lead to no agreements (i.e., the partner walking away without a deal), ultimately hurting the model’s overall performance. We investigate this observation in the context of the DealOrNoDeal task, a multi-issue negotiation over books, hats, and balls. Grounded in negotiation theory from Economics, we modify the training procedure in two novel ways to design agents with diverse personalities and analyze their performance with human partners. We find that although both techniques show promise, a selfish agent, which maximizes its own performance while also avoiding walkaways, performs better than other variants by implicitly learning to generate value for both itself and the negotiation partner. We discuss the implications of our findings for what it means to be a successful negotiation dialogue system and how these systems should be designed in the future.
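The core idea admits a one-function illustration. A minimal sketch, assuming an episode-level reward; the paper's exact objective and penalty value are not reproduced here:

```python
# "Selfish, but wisely": stay self-interested, but make walkaways costly
# so the agent learns the value of compromise during self-play RL.
def selfish_but_wise_reward(own_points, deal_reached, walkaway_penalty=10.0):
    """Episode-level reward for a DealOrNoDeal-style negotiation."""
    if not deal_reached:          # the partner walked away: nobody scores
        return -walkaway_penalty
    return float(own_points)      # otherwise, maximize own points only
```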
2022
Opponent Modeling in Negotiation Dialogues by Related Data Adaptation
Kushal Chawla | Gale Lucas | Jonathan May | Jonathan Gratch
Findings of the Association for Computational Linguistics: NAACL 2022
Opponent modeling is the task of inferring another party’s mental state within the context of social interactions. In a multi-issue negotiation, it involves inferring the relative importance that the opponent assigns to each issue under discussion, which is crucial for finding high-value deals. A practical model for this task needs to infer these priorities of the opponent on the fly based on partial dialogues as input, without needing additional annotations for training. In this work, we propose a ranker for identifying these priorities from negotiation dialogues. The model takes in a partial dialogue as input and predicts the priority order of the opponent. We further devise ways to adapt related data sources for this task to provide more explicit supervision for incorporating the opponent’s preferences and offers, as a proxy to relying on granular utterance-level annotations. We show the utility of our proposed approach through extensive experiments based on two dialogue datasets. We find that the proposed data adaptations lead to strong performance in zero-shot and few-shot scenarios. Moreover, they allow the model to perform better than baselines while accessing fewer utterances from the opponent. We release our code to support future work in this direction.
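A minimal sketch of such a ranker, with the encoder, dimensions, and training pairs as assumptions rather than the paper's specification:

```python
# Score each issue from an encoding of the partial dialogue; train with a
# pairwise margin loss so higher-priority issues receive higher scores.
import torch
import torch.nn as nn

class PriorityRanker(nn.Module):
    def __init__(self, encoder, hidden=768, n_issues=3):
        super().__init__()
        self.encoder = encoder                 # any text encoder -> (batch, hidden)
        self.scorer = nn.Linear(hidden, n_issues)

    def forward(self, dialogues):
        return self.scorer(self.encoder(dialogues))  # (batch, n_issues) scores

loss_fn = nn.MarginRankingLoss(margin=1.0)
# For each supervised pair (i, j) where issue i outranks issue j for the
# opponent: loss_fn(scores[:, i], scores[:, j], torch.ones(len(scores)))
```

The data-adaptation idea in the abstract then amounts to generating such (i, j) supervision from related sources (e.g., stated preferences and offers) instead of utterance-level annotation.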
2021
CaSiNo: A Corpus of Campsite Negotiation Dialogues for Automatic Negotiation Systems
Kushal Chawla | Jaysa Ramirez | Rene Clever | Gale Lucas | Jonathan May | Jonathan Gratch
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Automated systems that negotiate with humans have broad applications in pedagogy and conversational AI. To advance the development of practical negotiation systems, we present CaSiNo: a novel corpus of over a thousand negotiation dialogues in English. Participants take the role of campsite neighbors and negotiate for food, water, and firewood packages for their upcoming trip. Our design results in diverse and linguistically rich negotiations while maintaining a tractable, closed-domain environment. Inspired by the literature in human-human negotiations, we annotate persuasion strategies and perform correlation analysis to understand how the dialogue behaviors are associated with the negotiation performance. We further propose and evaluate a multi-task framework to recognize these strategies in a given utterance. We find that multi-task learning substantially improves the performance for all strategy labels, especially for the ones that are the most skewed. We release the dataset, annotations, and the code to propel future work in human-machine negotiations: https://github.com/kushalchawla/CaSiNo
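The multi-task recognizer can be sketched compactly; the label subset, sizes, and head design below are illustrative assumptions, not the released code:

```python
# Shared utterance encoder with one binary head per persuasion strategy
# (an utterance can realize several strategies at once).
import torch
import torch.nn as nn

STRATEGIES = ["small_talk", "empathy", "self_need", "other_need"]  # illustrative subset

class MultiTaskStrategyTagger(nn.Module):
    def __init__(self, encoder, hidden=768):
        super().__init__()
        self.encoder = encoder        # utterance encoder -> (batch, hidden)
        self.heads = nn.ModuleDict({s: nn.Linear(hidden, 1) for s in STRATEGIES})

    def forward(self, utterances):
        h = self.encoder(utterances)
        return {s: torch.sigmoid(head(h)).squeeze(-1)   # P(strategy | utterance)
                for s, head in self.heads.items()}
```

Sharing the encoder across heads is what lets the skewed, low-frequency strategy labels borrow signal from the frequent ones, consistent with the abstract's finding.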
2018
The Niki and Julie Corpus: Collaborative Multimodal Dialogues between Humans, Robots, and Virtual Agents
Ron Artstein | Jill Boberg | Alesia Gainer | Jonathan Gratch | Emmanuel Johnson | Anton Leuski | Gale Lucas | David Traum
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)
2014
The Distress Analysis Interview Corpus of human and computer interviews
Jonathan Gratch | Ron Artstein | Gale Lucas | Giota Stratou | Stefan Scherer | Angela Nazarian | Rachel Wood | Jill Boberg | David DeVault | Stacy Marsella | David Traum | Skip Rizzo | Louis-Philippe Morency
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
The Distress Analysis Interview Corpus (DAIC) contains clinical interviews designed to support the diagnosis of psychological distress conditions such as anxiety, depression, and post-traumatic stress disorder. The interviews are conducted by humans, human-controlled agents, and autonomous agents, and the participants include both distressed and non-distressed individuals. Data collected include audio and video recordings and extensive questionnaire responses; parts of the corpus have been transcribed and annotated for a variety of verbal and non-verbal features. The corpus has been used to support the creation of an automated interviewer agent, and for research on the automatic identification of psychological distress.