Beyond Instruction Following: Evaluating Inferential Rule Following of Large Language Models

Wangtao Sun, Chenxiang Zhang, XueYou Zhang, Xuanqing Yu, Ziyang Huang, Haotian Xu, Shizhu He, Jun Zhao, Kang Liu


Abstract
Although Large Language Models (LLMs) have demonstrated strong instruction-following ability, in real-world scenarios they are further expected to be controlled and guided by inferential rules in order to be safe, accurate, and intelligent. This demands that LLMs possess inferential rule-following capability. However, no prior work has clearly evaluated this capability: previous studies fail to distinguish inferential rule-following scenarios from instruction-following scenarios. This paper therefore first clarifies the concept of inferential rule following and proposes a comprehensive benchmark, RuleBench, to evaluate a diversified range of inferential rule-following abilities. Our experimental results on a variety of LLMs show that they are still limited in following rules. Our analysis of the evaluation results provides insights into how LLMs can be improved toward becoming better inferential rule-following intelligent agents. We further propose Inferential Rule-Following Tuning (IRFT). The experimental results show that through IRFT, LLMs can learn abstract inferential rule-following abilities from purely synthetic data and then generalize to RuleBench. The data and code can be found at: https://gitee.com/forangel2014/llm-rule-following-code
Anthology ID:
2025.ccl-1.79
Volume:
Proceedings of the 24th China National Conference on Computational Linguistics (CCL 2025)
Month:
August
Year:
2025
Address:
Jinan, China
Editors:
Maosong Sun, Peiyong Duan, Zhiyuan Liu, Ruifeng Xu, Weiwei Sun
Venue:
CCL
Publisher:
Chinese Information Processing Society of China
Pages:
1043–1066
URL:
https://preview.aclanthology.org/ingest-ccl/2025.ccl-1.79/
Cite (ACL):
Wangtao Sun, Chenxiang Zhang, XueYou Zhang, Xuanqing Yu, Ziyang Huang, Haotian Xu, Shizhu He, Jun Zhao, and Kang Liu. 2025. Beyond Instruction Following: Evaluating Inferential Rule Following of Large Language Models. In Proceedings of the 24th China National Conference on Computational Linguistics (CCL 2025), pages 1043–1066, Jinan, China. Chinese Information Processing Society of China.
Cite (Informal):
Beyond Instruction Following: Evaluating Inferential Rule Following of Large Language Models (Sun et al., CCL 2025)
PDF:
https://preview.aclanthology.org/ingest-ccl/2025.ccl-1.79.pdf