How Value Induction Reshapes LLM Behavior
Arnav Arora, Natalie Schluter, Katherine Metcalf, Maartje Ter Hoeve
Abstract
Conversational Large Language Models are post-trained on language that expresses specific behavioural traits, such as curiosity, open-mindedness, and empathy, and values, such as helpfulness, harmlessness, and honesty. This is done to increase utility, ensure safety, and improve the user experience of the people interacting with the model. However, values are complex and inter-related: incorporating one can modify behaviour on another. Further, incorporating certain values can make models more addictive or sycophantic, potentially having a detrimental effect on the users interacting with them. We investigate these and other unintended effects of value incorporation into models. We fine-tune models using value subsets of existing preference datasets, measuring the effect of inducing 15 values on safety, anthropomorphism, and various QA benchmarks. We find that i) inducing values also leads to expression of other related, and sometimes contrastive, values, ii) inducing positive values increases safety, and iii) all values increase anthropomorphic language use by models, making them more validating and sycophantic.
- Anthology ID:
- 2026.findings-acl.1302
- Volume:
- Findings of the Association for Computational Linguistics: ACL 2026
- Month:
- July
- Year:
- 2026
- Address:
- San Diego, California, United States
- Editors:
- Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 26131–26152
- URL:
- https://preview.aclanthology.org/ingest-acl/2026.findings-acl.1302/
- Cite (ACL):
- Arnav Arora, Natalie Schluter, Katherine Metcalf, and Maartje Ter Hoeve. 2026. How Value Induction Reshapes LLM Behavior. In Findings of the Association for Computational Linguistics: ACL 2026, pages 26131–26152, San Diego, California, United States. Association for Computational Linguistics.
- Cite (Informal):
- How Value Induction Reshapes LLM Behavior (Arora et al., Findings 2026)
- PDF:
- https://preview.aclanthology.org/ingest-acl/2026.findings-acl.1302.pdf