Yeongheon Lee
2026
Inertia in Moral and Value Judgments of Large Language Models
Bruce W. Lee | Yeongheon Lee | Hyunsoo Cho
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Large Language Models (LLMs) behave non-deterministically, and prompting has become a common method for steering their outputs. A popular strategy is to assign a persona to the model to produce more varied, context-sensitive responses, similar to how responses vary across human individuals. Against the expectation that persona prompting yields a wide range of opinions, our experiments show that LLMs keep consistent value orientations. We observe a persistent inertia in their responses, where certain moral and value dimensions (especially harm avoidance and fairness) stay skewed in one direction across persona settings. To study this, we use role-play at scale, which pairs randomized persona prompts with a macro-level analysis of model outputs. Our results point to strong internal biases and value preferences in LLMs, which we call value orientation and inertia. These models warrant scrutiny and adjustment before use in applications where balanced outputs matter.
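The role-play-at-scale idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the persona attributes, the prompt template, the function names, and the `stub_model` stand-in for a real LLM call are all hypothetical, chosen only to show how randomized persona prompts might be paired with an aggregate (macro-level) statistic.

```python
import random
from collections import Counter

# Hypothetical persona attributes; the paper's actual persona pool is not given here.
AGES = ["25-year-old", "40-year-old", "70-year-old"]
JOBS = ["nurse", "farmer", "engineer", "artist"]
REGIONS = ["Seoul", "Lagos", "Lima", "Oslo"]


def random_persona(rng):
    """Draw one persona description by sampling each attribute independently."""
    return f"a {rng.choice(AGES)} {rng.choice(JOBS)} from {rng.choice(REGIONS)}"


def build_prompt(persona, statement):
    """Wrap a value statement in a persona instruction (assumed template)."""
    return (
        f"You are {persona}. Do you agree or disagree with the following? "
        f"{statement} Answer 'agree' or 'disagree'."
    )


def stub_model(prompt, rng):
    """Placeholder for a real LLM call (assumption, not a real API).

    It ignores the persona and answers 'agree' about 90% of the time,
    mimicking a response that barely moves across persona settings.
    """
    return "agree" if rng.random() < 0.9 else "disagree"


def agreement_rate(statement, n_personas, seed=0):
    """Macro-level statistic: share of 'agree' answers over random personas."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_personas):
        prompt = build_prompt(random_persona(rng), statement)
        counts[stub_model(prompt, rng)] += 1
    return counts["agree"] / n_personas


if __name__ == "__main__":
    rate = agreement_rate("Causing harm to others should be avoided.", 1000)
    print(f"agreement rate across personas: {rate:.2f}")
```

Under these assumptions, an agreement rate that stays far from 0.5 no matter which personas are sampled is the kind of skew the abstract calls inertia. Swapping `stub_model` for a real model call would be the first step toward an actual experiment.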