Probing LLM World Models: Enhancing Guesstimation with Wisdom of Crowds Decoding
Yun-Shiuan Chuang | Sameer Narendran | Nikunj Harlalka | Alexander Cheung | Sizhe Gao | Siddharth Suresh | Junjie Hu | Timothy T. Rogers
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Guesstimation—the task of making approximate quantitative estimates about objects or events—is a common real-world skill, yet it remains underexplored in large language model (LLM) research. We introduce three guesstimation datasets: MARBLES, FUTURE, and ELECPRED, ranging from physical estimation (e.g., how many marbles fit in a cup) to abstract prediction (e.g., the outcome of the 2024 U.S. presidential election). Inspired by the social science concept of the Wisdom of Crowds (WOC)—where the median of multiple estimates improves accuracy—we propose WOC decoding for LLMs. We replicate WOC effects in human participants and find that LLMs exhibit similar benefits: median aggregation across sampled responses consistently improves accuracy over greedy decoding, self-consistency decoding, and mean decoding. This suggests that LLMs encode a world model that supports approximate reasoning. Our results position guesstimation as a useful probe of LLM world knowledge and highlight WOC decoding as a strategy for enhancing LLM guesstimation performance on real-world tasks.
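To make the decoding strategy concrete, here is a minimal sketch of WOC decoding as the abstract describes it: sample several independent numeric estimates from the model and aggregate them with the median. The `sample_fn` callable and the number-parsing regex are illustrative assumptions, not the authors' implementation.

```python
import re
import statistics
from typing import Callable, Optional

def parse_number(text: str) -> Optional[float]:
    """Pull the first numeric value out of a free-text LLM response."""
    match = re.search(r"-?\d+(?:,\d{3})*(?:\.\d+)?", text)
    return float(match.group().replace(",", "")) if match else None

def woc_decode(prompt: str, sample_fn: Callable[[str], str],
               n_samples: int = 20) -> float:
    """Wisdom-of-Crowds decoding: sample several independent estimates
    from the model and return their median.

    `sample_fn(prompt)` is a hypothetical caller-supplied function that
    queries the LLM once at nonzero temperature and returns its text.
    """
    estimates = []
    for _ in range(n_samples):
        value = parse_number(sample_fn(prompt))
        if value is not None:
            estimates.append(value)
    if not estimates:
        raise ValueError("no numeric estimates could be parsed")
    # Median aggregation is robust to outlier guesses, which is the
    # mechanism behind the WOC effect; mean aggregation is not.
    return statistics.median(estimates)
```

The median's robustness to a few wildly wrong samples is what distinguishes WOC decoding from mean aggregation, consistent with the comparison reported in the abstract.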