Miao Liu
2025
Evaluating the Prompt Steerability of Large Language Models
Erik Miehling | Michael Desmond | Karthikeyan Natesan Ramamurthy | Elizabeth M. Daly | Kush R. Varshney | Eitan Farchi | Pierre Dognin | Jesus Rios | Djallel Bouneffouf | Miao Liu | Prasanna Sattigeri
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Building pluralistic AI requires designing models that are able to be shaped to represent a wide range of value systems and cultures. Achieving this requires first being able to evaluate the degree to which a given model is capable of reflecting various personas. To this end, we propose a benchmark for evaluating the steerability of model personas as a function of prompting. Our design is based on a formal definition of prompt steerability, which analyzes the degree to which a model’s joint behavioral distribution can be shifted from its baseline. By defining steerability indices and inspecting how these indices change as a function of steering effort, we can estimate the steerability of a model across various persona dimensions and directions. Our benchmark reveals that the steerability of many current models is limited — due to both a skew in their baseline behavior and an asymmetry in their steerability across many persona dimensions. We release an implementation of our benchmark at https://github.com/IBM/prompt-steering.
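The abstract above describes steerability in terms of how far prompting can shift a model's behavioral distribution away from its baseline. The sketch below illustrates that idea only and is not the paper's actual formulation: it scores the shift along a single persona dimension as the total variation distance between binned baseline and steered behavior scores, with synthetic data standing in for model outputs. All function names and distributions here are illustrative assumptions; the released benchmark at the link above implements the real indices.

# Minimal sketch of distribution-shift-based steerability (illustrative only,
# not the paper's definition). Behavior scores are assumed to lie in [0, 1].
import numpy as np

def behavior_histogram(scores, bins=10):
    """Bin per-response behavior scores into a normalized distribution."""
    hist, _ = np.histogram(scores, bins=bins, range=(0.0, 1.0))
    return hist / hist.sum()

def steerability_index(baseline_scores, steered_scores, bins=10):
    """Total variation distance between baseline and steered behavior:
    0 = no shift, 1 = maximal shift (a proxy, not the paper's index)."""
    p = behavior_histogram(baseline_scores, bins)
    q = behavior_histogram(steered_scores, bins)
    return 0.5 * np.abs(p - q).sum()

# Example: a model whose baseline skews low on some persona dimension,
# prompted ("steered") toward higher values with increasing effort.
rng = np.random.default_rng(0)
baseline = rng.beta(2, 5, size=500)          # skewed baseline behavior
for effort in (1, 2, 4):                     # more steering examples in the prompt
    steered = rng.beta(2 + effort, 5, size=500)
    print(effort, round(steerability_index(baseline, steered), 3))

Plotting such an index against steering effort, separately for each direction of each persona dimension, is one simple way to surface the kind of baseline skew and asymmetric steerability the abstract reports.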
2023
Werewolf Among Us: Multimodal Resources for Modeling Persuasion Behaviors in Social Deduction Games
Bolin Lai | Hongxin Zhang | Miao Liu | Aryan Pariani | Fiona Ryan | Wenqi Jia | Shirley Anugrah Hayati | James Rehg | Diyi Yang
Findings of the Association for Computational Linguistics: ACL 2023
Persuasion modeling is a key building block for conversational agents. Existing works in this direction are limited to analyzing textual dialogue corpora. We argue that visual signals also play an important role in understanding human persuasive behaviors. In this paper, we introduce the first multimodal dataset for modeling persuasion behaviors. Our dataset includes 199 dialogue transcriptions and videos captured in a multi-player social deduction game setting, 26,647 utterance-level annotations of persuasion strategy, and game-level annotations of deduction game outcomes. We provide extensive experiments to show how dialogue context and visual signals benefit persuasion strategy prediction. We also explore the generalization ability of language models for persuasion modeling and the role of persuasion strategies in predicting social deduction game outcomes. Our dataset can be found at https://persuasion-deductiongame.socialai-data.org. The codes and models are available at https://github.com/SALT-NLP/PersuationGames.
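To make the utterance-level persuasion strategy prediction task concrete, the sketch below trains a minimal text-classification baseline. It is an illustration of the task rather than the authors' models: the toy utterances and strategy labels are invented, and the released dataset and code linked above define the actual schema, label set, and the stronger multimodal approaches studied in the paper.

# Toy baseline for utterance-level persuasion strategy prediction
# (illustrative only; real utterances and labels come from the dataset above).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

train_utterances = [
    "Trust me, I was with him the whole night.",
    "If we vote her out now, we lose our best source of information.",
    "Think about it: who benefits most if we waste this round?",
    "I promise I'm on the village side, you can check my claims later.",
]
train_strategies = ["Identity Declaration", "Call for Action",
                    "Evidence", "Identity Declaration"]  # made-up labels

# TF-IDF features plus a linear classifier: a common text-only baseline
# against which dialogue-context and visual models can be compared.
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                    LogisticRegression(max_iter=1000))
clf.fit(train_utterances, train_strategies)

print(clf.predict(["Let's all vote for the quiet player this round."]))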