AWARE: Agentic Knowledge Warehousing for Contextual Intelligence

Hongjin Qian, Siqi Bao, Zhao Cao, Zheng Liu


Abstract
Information seeking bridges the knowledge gap between a query and its answer. Although LLMs perform well broadly, their ability to close this gap is limited by pretraining and degrades on specialized or up-to-date queries. A common remedy augments LLMs with external knowledge, either by injecting retrieved evidence into context or by interleaving retrieval with reasoning. The former limits exploration of layered dependencies, while the latter is bounded by context length, constraining efficiency and scalability. For complex tasks with intricate dependencies and large text volumes, both approaches become inadequate. To tackle this bottleneck, we present AWARE (Agentic Knowledge Warehouse), an agentic knowledge warehousing framework that transforms heterogeneous, unstructured data into minimal, task-conditioned knowledge representations consumable by LLMs. Rather than exposing raw text, AWARE constructs knowledge through intent planning, online multi-threaded exploration, and map-reduce evidence integration, producing compact, LLM-ready context under finite budgets. Specifically, it applies offline document structuring to generate document headers that support controlled access, performs exploration with targeted refinement to recover layered information dependencies, and integrates distributed evidence into task-aware representations for downstream answer generation. Experiments on GAIA, WebWalker, and BrowseComp-Plus show improvements over all baselines.
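
The three stages named in the abstract (intent planning, multi-threaded exploration, map-reduce evidence integration under a budget) can be illustrated with a toy sketch. This is not the authors' implementation: the corpus, the keyword-based planner, the function names (`plan_intents`, `explore`, `integrate`, `build_context`), and the word-count budget are all assumptions made here for illustration, and simple string matching stands in for any LLM calls.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy corpus (assumed): each document has a short header, built offline, and a body.
DOCS = [
    {"header": "solar panels efficiency",
     "body": "Panel efficiency reached 24 percent. Costs fell sharply."},
    {"header": "wind turbines output",
     "body": "Turbine output doubled. Offshore sites grew."},
    {"header": "solar storage batteries",
     "body": "Battery storage pairs with solar farms. Prices vary."},
]

def plan_intents(query):
    """Hypothetical planner: one intent per distinct lowercase query term."""
    return list(dict.fromkeys(query.lower().split()))

def explore(intent):
    """Map step: open only documents whose header mentions the intent,
    then keep the sentences that mention it."""
    evidence = []
    for doc in DOCS:
        if intent in doc["header"].split():
            for sentence in doc["body"].split(". "):
                if intent in sentence.lower():
                    evidence.append(sentence.strip(". ") + ".")
    return evidence

def integrate(evidence_lists, budget_words):
    """Reduce step: merge evidence, drop duplicates, and skip any sentence
    that would push the context past the word budget."""
    context, used, seen = [], 0, set()
    for evidence in evidence_lists:
        for sentence in evidence:
            n = len(sentence.split())
            if sentence in seen or used + n > budget_words:
                continue
            seen.add(sentence)
            context.append(sentence)
            used += n
    return context

def build_context(query, budget_words=12):
    intents = plan_intents(query)
    # One exploration thread per intent; pool.map preserves intent order.
    with ThreadPoolExecutor(max_workers=4) as pool:
        evidence_lists = list(pool.map(explore, intents))
    return integrate(evidence_lists, budget_words)

print(build_context("solar storage"))
```

In this sketch the header check plays the role of controlled access: a document body is read only when its header matches an intent, and the reduce step enforces the budget so the final context stays compact regardless of how much evidence the threads return.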
Anthology ID: 2026.findings-acl.120
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 2530–2540
URL: https://preview.aclanthology.org/ingest-acl/2026.findings-acl.120/
Cite (ACL): Hongjin Qian, Siqi Bao, Zhao Cao, and Zheng Liu. 2026. AWARE: Agentic Knowledge Warehousing for Contextual Intelligence. In Findings of the Association for Computational Linguistics: ACL 2026, pages 2530–2540, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): AWARE: Agentic Knowledge Warehousing for Contextual Intelligence (Qian et al., Findings 2026)
PDF: https://preview.aclanthology.org/ingest-acl/2026.findings-acl.120.pdf
Checklist: 2026.findings-acl.120.checklist.pdf