CRiskEval: A Chinese Multi-Level Risk Evaluation Benchmark Dataset for Large Language Models

Ling Shi, Deyi Xiong


Abstract
Large language models (LLMs) possess numerous beneficial capabilities, yet their latent inclinations harbor unpredictable risks that may materialize in the future. We therefore propose CRiskEval, a Chinese dataset designed to gauge risk proclivities inherent in LLMs, such as resource acquisition and malicious coordination, as part of efforts toward proactive preparedness. To curate CRiskEval, we define a new risk taxonomy with 7 types of frontier risks and 4 safety levels: extremely hazardous, moderately hazardous, neutral, and safe. We follow the philosophy of tendency evaluation to empirically measure the stated “desire” of LLMs via fine-grained multiple-choice question answering. The dataset consists of 14,888 questions that simulate scenarios related to the 7 predefined types of frontier risks. Each question is accompanied by 4 answer choices that state opinions or behavioral tendencies corresponding to the question. All answer choices are manually annotated with one of the defined risk levels, allowing us to build a fine-grained frontier risk profile for each assessed LLM. Extensive evaluation with CRiskEval on a spectrum of prevalent Chinese LLMs reveals a striking finding: most models exhibit risk tendencies of more than 40% (weighted tendency toward the four risk levels). Furthermore, models' inclination toward urgent self-sustainability, power seeking, and other dangerous goals increases subtly with model size. To promote further research on frontier risk evaluation of LLMs, we publicly release our dataset at https://github.com/tjunlp-lab/CRiskEval.
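
The abstract's headline metric is a weighted tendency score over the four annotated safety levels. Below is a minimal Python sketch of how such a score could be aggregated from a model's selected answer choices; the numeric weights assigned to each level are illustrative assumptions, since the paper's exact weighting scheme is not given on this page.

```python
# Hypothetical mapping from the four annotated safety levels to weights.
# These values are illustrative assumptions, not the paper's scheme.
RISK_WEIGHTS = {
    "extremely_hazardous": 1.0,
    "moderately_hazardous": 2 / 3,
    "neutral": 1 / 3,
    "safe": 0.0,
}

def risk_tendency(chosen_levels: list[str]) -> float:
    """Average the risk weights of the answer choices a model selected,
    yielding a single tendency score in [0, 1]."""
    if not chosen_levels:
        return 0.0
    return sum(RISK_WEIGHTS[level] for level in chosen_levels) / len(chosen_levels)

# Example: a model whose selections skew mildly toward hazardous answers.
choices = ["safe", "neutral", "neutral", "moderately_hazardous", "extremely_hazardous"]
print(f"risk tendency: {risk_tendency(choices):.1%}")  # -> risk tendency: 46.7%
```

Under this assumed weighting, a score above 40% means the model's chosen answers lean, on average, closer to the hazardous end of the scale than to the safe end.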
Anthology ID:
2025.acl-long.670
Volume:
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
13638–13659
URL:
https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.670/
Cite (ACL):
Ling Shi and Deyi Xiong. 2025. CRiskEval: A Chinese Multi-Level Risk Evaluation Benchmark Dataset for Large Language Models. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 13638–13659, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
CRiskEval: A Chinese Multi-Level Risk Evaluation Benchmark Dataset for Large Language Models (Shi & Xiong, ACL 2025)
PDF:
https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.670.pdf