DefGen-Bench: A Benchmark for Chinese Criminal Defence Opinion Generation in LegalAI

Senbo Zhang; Qiqi Wang; Fanghao Lou; Guanyu Chen; Yihong Pan; Huijia Li; Qian Liu

Abstract

A defence opinion is an essential step in criminal proceedings, yet it has not been systematically formulated or evaluated as a specific LegalAI task. Grounded in legal principles and practice, we formulate this task as generating a structured defence opinion conditioned jointly on an indictment and the defendant’s stated opinion, which often present conflicting claims. We formalize this setting as a dual-perspective generation problem and introduce DefGen-Bench, a benchmark comprising several Chinese criminal cases with expert-reviewed reference defence opinions. We evaluate eight large language models (LLMs) on this task and observe that existing models tend to mirror the defendant’s opinion, thereby overlooking more appropriate defence strategies. To address this challenge, we propose Knowledge-Enhanced Highlighted Indictment (KHI), a legal knowledge–guided input enhancement method applicable to both open- and closed-source LLMs. Experiments demonstrate consistent improvements across all evaluated LLMs, validating the effectiveness of the proposed approach.

Anthology ID: 2026.acl-long.1635
Volume: Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 35378–35392
URL: https://preview.aclanthology.org/ingest-acl/2026.acl-long.1635/
Cite (ACL): Senbo Zhang, Qiqi Wang, Fanghao Lou, Guanyu Chen, Yihong Pan, Huijia Li, and Qian Liu. 2026. DefGen-Bench: A Benchmark for Chinese Criminal Defence Opinion Generation in LegalAI. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 35378–35392, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): DefGen-Bench: A Benchmark for Chinese Criminal Defence Opinion Generation in LegalAI (Zhang et al., ACL 2026)
PDF: https://preview.aclanthology.org/ingest-acl/2026.acl-long.1635.pdf
Checklist: 2026.acl-long.1635.checklist.pdf
