Analyzing and Internalizing Complex Policy Documents for LLM Agents

Jiateng Liu; Zhenhailong Wang; Xiaojiang Huang; Yingjie Li; Xiang Li; Chenlei Guo; Xing Fan; Ruhi Sarikaya; Heng Ji

Abstract

Large language model agents rely on in-context policy documents encoding diverse business rules. As businesses scale, these documents grow, creating substantial computational overhead and motivating internalization methods that embed policy into model priors. Prior work focuses on generic prompts, but we find agentic policies span multiple complexity levels and demand heavier reasoning, posing greater challenges. We introduce an agentic benchmark generator with Controllable Complexity in agent policy across four levels, enabling systematic evaluation of agents under increasing complexity and providing a testbed for policy internalization. Our analysis shows that workflow-governing policy specifications are the hardest to reason over, and that SFT on gold trajectories with chain-of-thought is data-hungry and struggles at high complexity. We propose Category-Aware Policy Continued Pretraining, an automated pipeline that analyzes policies, extracts key specifications, categorizes them into factual, behavioral, and conditional types, and isolates those driving workflow complexity. This enables targeted “therapy” by synthesizing specialized training data for each type and improving internalization via an autoregressive pretraining loss. Extensive experiments show our synthetic data and objective consistently improve performance. Combined with SFT, our method outperforms the baseline across different settings, especially in data-sparse and high-complexity regimes, with gains up to 41% and 22% on Qwen-3-32B. Overall, we achieve 97.3% prompt reduction on our benchmark, and on 𝜏-Bench we further improve performance while reducing prompt requirements with very limited SFT data.

Anthology ID: 2026.acl-long.767
Volume: Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 16814–16851
URL: https://preview.aclanthology.org/ingest-acl/2026.acl-long.767/
Cite (ACL): Jiateng Liu, Zhenhailong Wang, Xiaojiang Huang, Yingjie Li, Xiang Li, Chenlei Guo, Xing Fan, Ruhi Sarikaya, and Heng Ji. 2026. Analyzing and Internalizing Complex Policy Documents for LLM Agents. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 16814–16851, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): Analyzing and Internalizing Complex Policy Documents for LLM Agents (Liu et al., ACL 2026)
PDF: https://preview.aclanthology.org/ingest-acl/2026.acl-long.767.pdf
Checklist: 2026.acl-long.767.checklist.pdf
