From Word to World: Can Large Language Models be Implicit Text-based World Models?

Yixia Li; Hongru Wang; Jiahao Qiu; Zhenfei Yin; Dongdong Zhang; Cheng Qian; Zeping Li; Xiaoteng Ma; Guanhua Chen; Heng Ji

From Word to World: Can Large Language Models be Implicit Text-based World Models?

Yixia Li, Hongru Wang, Jiahao Qiu, Zhenfei Yin, Dongdong Zhang, Cheng Qian, Zeping Li, Xiaoteng Ma, Guanhua Chen, Heng Ji

Abstract

Agentic learning increasingly hinges on interaction, yet real-world experience is expensive, limited, and often irreversible at inference time. World models promise to mitigate these limitations, but it remains unclear whether large language models can actually serve as reliable world models, and deliver concrete benefits to downstream agents. We investigate these questions in text-based environments, a controlled testbed that reframes language modeling as next-state prediction under interaction. We propose a three-level framework to evaluate LLM-based world models: (i) fidelity and consistency, (ii) scalability and robustness, and (iii) agent utility. Across five representative environments, we show that sufficiently trained world models capture coherent environment dynamics, scale predictably with data and model capacity, and unlock tangible agent improvements—for example, action verification boosts GPT-4o by 5.5% on WebShop, and warm-started RL achieves a 15% gain on SciWorld. Crucially, these benefits hinge on behavioral coverage and environment complexity, sharply characterizing when world modeling meaningfully advances agent learning.

Anthology ID:: 2026.acl-long.366
Volume:: Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:: July
Year:: 2026
Address:: San Diego, California, United States
Editors:: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:: ACL
SIG:
Publisher:: Association for Computational Linguistics
Note:
Pages:: 8084–8111
Language:
URL:: https://preview.aclanthology.org/ingest-acl/2026.acl-long.366/
DOI:
Bibkey:
Cite (ACL):: Yixia Li, Hongru Wang, Jiahao Qiu, Zhenfei Yin, Dongdong Zhang, Cheng Qian, Zeping Li, Xiaoteng Ma, Guanhua Chen, and Heng Ji. 2026. From Word to World: Can Large Language Models be Implicit Text-based World Models?. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 8084–8111, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):: From Word to World: Can Large Language Models be Implicit Text-based World Models? (Li et al., ACL 2026)
Copy Citation:
PDF:: https://preview.aclanthology.org/ingest-acl/2026.acl-long.366.pdf
Checklist:: 2026.acl-long.366.checklist.pdf

PDF Cite Search Checklist Fix data