Zehan Ma
2026
Pub-LawBench: Public-Oriented Benchmarking for LegalAI
Qiaoyu Zheng | Zehan Ma | Yijing Zhang | Qiqi Wang | Huijia Li | Qian Liu
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Large language models (LLMs) are playing an increasingly pivotal role in LegalAI. However, existing benchmarks are primarily tailored for legal professionals, emphasizing deep reasoning and explainability. Public-facing legal applications, by contrast, demand outputs that are direct, actionable, and accessible, a need largely overlooked by current evaluation frameworks. To bridge this gap, we propose a public-oriented LegalAI benchmark grounded in legal functionalism and genre analysis. Specifically, we categorize public legal demands into two core tasks: Instant Question Answering and Legal Text Generation. We further introduce three public-oriented evaluation dimensions (legal normativity, content relevance, and format usability), which collectively assess the practical validity and user readiness of model outputs. To reflect real-world usage by lay users, we evaluate 17 LLMs on Pub-LawBench using only simple prompts and Chain-of-Thought under a vanilla inference setting, excluding complex techniques such as RAG or agent-based methods that are inaccessible to non-experts. Experiments reveal limitations of current LLMs in delivering effective public-oriented legal assistance, highlighting the need for more user-centric model development and benchmarking. Our code and datasets are available for review at https://anonymous.4open.science/r/P-LawBench-E565/.