The Conservative AI: Diagnosing Hold Bias and Reliability Limits in Persona-Based Monetary Policy Simulation

Giyong Kim, Sojung Kim


Abstract
We examine whether large language models (LLMs) can reliably simulate historical FOMC policy decisions and whether persona-based agentic deliberation improves performance. Using strictly time-consistent vintage economic information, we evaluate multiple state-of-the-art LLMs on a three-way Hike/Hold/Cut classification task in both single-agent and multi-agent settings. Single-LLM baselines achieve nontrivial accuracy and track broad policy regime shifts, establishing a simple but strong benchmark. However, we identify a systematic behavioral asymmetry that we term Hold bias: models disproportionately favor Hold decisions and remain reluctant to predict Cut outcomes even during easing cycles. This conservatism is especially costly around regime turning points, where reliable adaptation matters most. We further find that standard agentic workflows, including debate and consensus-style aggregation, do not mitigate this problem and often amplify caution rather than improve accuracy. Overall, our results show that plausible deliberation is not sufficient for trustworthy decision support. Progress will require agentic systems explicitly designed to diagnose and correct structural bias, rather than merely reproducing surface-level committee interaction.
Anthology ID:
2026.trustnlp-main.52
Volume:
Proceedings of the 6th Workshop on Trustworthy NLP (TrustNLP 2026)
Month:
July
Year:
2026
Address:
San Diego, California
Editors:
Kai-Wei Chang, Ninareh Mehrabi, Satyapriya Krishna, Anubrata Das, Jwala Dhamala, Yang Trista Cao, Tharindu Kumarage, Anil Ramakrishna, Christos Christodoulopoulos, Yixin Wan, Aram Galstyan, Anoop Kumar, Rahul Gupta
Venues:
TrustNLP | WS
Publisher:
Association for Computational Linguistics
Pages:
663–677
URL:
https://preview.aclanthology.org/ingest-acl-workshops/2026.trustnlp-main.52/
Cite (ACL):
Giyong Kim and Sojung Kim. 2026. The Conservative AI: Diagnosing Hold Bias and Reliability Limits in Persona-Based Monetary Policy Simulation. In Proceedings of the 6th Workshop on Trustworthy NLP (TrustNLP 2026), pages 663–677, San Diego, California. Association for Computational Linguistics.
Cite (Informal):
The Conservative AI: Diagnosing Hold Bias and Reliability Limits in Persona-Based Monetary Policy Simulation (Kim & Kim, TrustNLP 2026)
PDF:
https://preview.aclanthology.org/ingest-acl-workshops/2026.trustnlp-main.52.pdf