Dimitrios Georgousis

2026

Evaluating Counterfactual Strategic Reasoning in Large Language Models
Dimitrios Georgousis | Maria Lymperaiou | Angeliki Dimitriou | Giorgos Filandrianos | Giorgos Stamou
Proceedings of the Fifth Workshop on Generation, Evaluation and Metrics (GEM)

We evaluate whether LLMs adapt their strategic behavior when familiar games are counterfactually modified. We introduce a repeated-game evaluation framework covering Prisoner’s Dilemma and Rock–Paper–Scissors under default, label-perturbed, payoff-perturbed, and joint counterfactual variants. This design separates surface robustness to renamed actions from deeper sensitivity to changed incentives. Across multiple frontier LLMs, we find that label perturbations usually cause moderate degradation, whereas payoff perturbations expose stronger failures: LLMs often preserve canonical strategies even when the equilibrium structure changes. In RPS, several LLMs remain close to uniform play despite a payoff-counterfactual equilibrium requiring a biased mixed strategy. Behavioral and efficiency metrics further show that stronger or reasoning-enabled LLMs are not uniformly more strategic: some deliberate more without adapting faster. Overall, counterfactual repeated games provide a compact diagnostic for distinguishing robust incentive-sensitive behavior from brittle template-based strategic execution.
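To make the payoff-counterfactual claim concrete, the following is a minimal sketch of why changing win payoffs in Rock–Paper–Scissors moves the equilibrium away from uniform play. It is not the paper's evaluation code: the payoff values, the function names, and the choice of a zero-sum variant with one win payoff per action are all illustrative assumptions, since the abstract does not state the exact perturbations used.

```python
from fractions import Fraction

ACTIONS = ("rock", "paper", "scissors")


def rps_payoff_matrix(rock_win=1, paper_win=1, scissors_win=1):
    """Row player's payoffs in a zero-sum RPS variant (rows/cols: R, P, S).

    Hypothetical parameterization: each argument is the payoff earned when
    that action wins; the loser receives the negative of it.
    """
    a, b, c = Fraction(rock_win), Fraction(paper_win), Fraction(scissors_win)
    return [
        [0, -b, a],   # rock: loses to paper, beats scissors
        [b, 0, -c],   # paper: beats rock, loses to scissors
        [-a, c, 0],   # scissors: loses to rock, beats paper
    ]


def symmetric_equilibrium(rock_win=1, paper_win=1, scissors_win=1):
    """Mixed equilibrium for the variant above (all win payoffs must be > 0).

    Solving the indifference conditions gives probabilities proportional to
    (scissors_win, rock_win, paper_win) for (rock, paper, scissors).
    """
    a, b, c = Fraction(rock_win), Fraction(paper_win), Fraction(scissors_win)
    total = a + b + c
    return [c / total, a / total, b / total]


def expected_payoffs(matrix, opponent_mix):
    """Expected payoff of each pure action against a mixed opponent."""
    return [sum(row[j] * opponent_mix[j] for j in range(3)) for row in matrix]


if __name__ == "__main__":
    # Assumed counterfactual: a rock win is worth 2 instead of 1.
    game = rps_payoff_matrix(rock_win=2)
    mix = symmetric_equilibrium(rock_win=2)
    uniform = [Fraction(1, 3)] * 3
    print(dict(zip(ACTIONS, mix)))
    print(dict(zip(ACTIONS, expected_payoffs(game, mix))))
    print(dict(zip(ACTIONS, expected_payoffs(game, uniform))))
```

Under this assumed perturbation the equilibrium places probability 1/2 on paper and 1/4 each on rock and scissors. Every pure action earns zero against that mix, whereas an opponent who keeps playing uniformly can be exploited by always choosing rock (expected payoff 1/3 per round), which is the kind of failure the abstract describes.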

