Probing Political Ideology in Large Language Models: How Latent Political Representations Generalize Across Tasks

Tianyi Zhang


Abstract
Large language models (LLMs) encode rich internal representations of political ideology, but it remains unclear how these representations contribute to model decision-making, and how these latent dimensions interact with one another. In this work, we investigate whether ideological directions identified via linear probes—specifically, those predicting DW-NOMINATE scores from attention head activations—influence model behavior in downstream political tasks. We apply inference-time interventions to steer a decoder-only transformer along learned ideological directions, and evaluate their effect on three tasks: political bias detection, voting preference simulation, and bias neutralization via rewriting. Our results show that learned ideological representations generalize well to bias detection, but not as well to voting simulations, suggesting that political ideology is encoded in multiple, partially disentangled latent structures. We also observe asymmetries in how interventions affect liberal versus conservative outputs, raising concerns about pretraining-induced bias and post-training alignment effects. This work highlights the risks of using biased LLMs for politically sensitive tasks, and calls for deeper investigation into the interaction of social dimensions in model representations, as well as methods for steering them toward fairer, more transparent behavior.
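For concreteness, the sketch below illustrates the two ingredients named in the abstract: a linear probe mapping attention-head activations to DW-NOMINATE scores, and an inference-time intervention that shifts those activations along the learned direction. This is a minimal illustration under stated assumptions, not the paper's released code; the Ridge regressor, the steering strength, the single-head hook layout, and the sign convention are all hypothetical choices made for the example.

# Minimal sketch (illustrative, not the paper's code) of probing and
# inference-time intervention as described in the abstract. Assumptions:
# Ridge as the probe, one target attention head, and an additive shift
# of fixed strength; the paper's exact choices may differ.
import numpy as np
import torch
from sklearn.linear_model import Ridge

def fit_ideology_probe(X: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Fit a linear probe from head activations X (n, d_head) to
    DW-NOMINATE scores y (n,); return the unit ideology direction."""
    probe = Ridge(alpha=1.0).fit(X, y)
    return probe.coef_ / np.linalg.norm(probe.coef_)

def make_steering_hook(direction: np.ndarray, strength: float,
                       head: int, d_head: int):
    """Forward hook that shifts one head's slice of a concatenated
    attention output of shape (batch, seq, n_heads * d_head) along
    `direction`. The sign convention (e.g. positive = conservative)
    is an assumption."""
    d = torch.tensor(direction, dtype=torch.float32)

    def hook(module, inputs, output):
        out = output.clone()
        lo, hi = head * d_head, (head + 1) * d_head
        out[..., lo:hi] += strength * d.to(out.device)
        return out

    return hook

# Usage (hypothetical module path for a decoder-only transformer):
#   handle = model.layers[12].attn.register_forward_hook(
#       make_steering_hook(direction, strength=5.0, head=3, d_head=64))
#   ...generate as usual, then handle.remove() to restore the base model.

Because the hook edits activations at inference time only, the same learned direction can be switched on or off per task, which mirrors how the abstract describes evaluating one intervention across the three downstream tasks.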
Anthology ID: 2025.findings-emnlp.1267
Volume: Findings of the Association for Computational Linguistics: EMNLP 2025
Month: November
Year: 2025
Address: Suzhou, China
Editors: Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 23349–23360
URL: https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.1267/
DOI: 10.18653/v1/2025.findings-emnlp.1267
Cite (ACL): Tianyi Zhang. 2025. Probing Political Ideology in Large Language Models: How Latent Political Representations Generalize Across Tasks. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 23349–23360, Suzhou, China. Association for Computational Linguistics.
Cite (Informal): Probing Political Ideology in Large Language Models: How Latent Political Representations Generalize Across Tasks (Zhang, Findings 2025)
PDF: https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.1267.pdf
Checklist: 2025.findings-emnlp.1267.checklist.pdf