PolicyLLM: Towards Excellent Comprehension of Public Policy for Large Language Models

Han Bao, Penghao Zhang, Yue Huang, Zhengqing Yuan, Yanchi Ru, SU Rui, Yujun Zhou, Xiangqi Wang, Kehan Guo, Nitesh V Chawla, Yanfang Ye, Xiangliang Zhang


Abstract
Large Language Models (LLMs) are increasingly integrated into real-world decision-making, including in the domain of public policy. Yet their ability to comprehend and reason about policy-related content remains underexplored. To fill this gap, we present PolicyBench, the first large-scale bilingual benchmark for evaluating policy comprehension, comprising 21K cases across a broad spectrum of policy areas and capturing the diversity and complexity of real-world governance. Following Bloom’s taxonomy, the benchmark assesses three core capabilities: (1) Memorization: factual recall of policy knowledge, (2) Understanding: conceptual and contextual reasoning, and (3) Application: problem-solving in real-life policy scenarios. Building on this benchmark, we further propose PolicyMoE, a domain-specialized Mixture-of-Experts (MoE) model with expert modules aligned to each cognitive level. The proposed model demonstrates stronger performance on application-oriented policy tasks than on memorization or conceptual understanding, and yields the highest accuracy on structured reasoning tasks. Our results reveal key limitations of current LLMs in policy understanding and suggest paths toward more reliable, policy-focused models.
Anthology ID:
2026.findings-acl.107
Volume:
Findings of the Association for Computational Linguistics: ACL 2026
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
2249–2274
URL:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.107/
Cite (ACL):
Han Bao, Penghao Zhang, Yue Huang, Zhengqing Yuan, Yanchi Ru, SU Rui, Yujun Zhou, Xiangqi Wang, Kehan Guo, Nitesh V Chawla, Yanfang Ye, and Xiangliang Zhang. 2026. PolicyLLM: Towards Excellent Comprehension of Public Policy for Large Language Models. In Findings of the Association for Computational Linguistics: ACL 2026, pages 2249–2274, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
PolicyLLM: Towards Excellent Comprehension of Public Policy for Large Language Models (Bao et al., Findings 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.107.pdf
Checklist:
 2026.findings-acl.107.checklist.pdf