Bingjin Chen
2026
Distributional Clarity: The Hidden Driver of RL-Friendliness in Large Language Models
Shaoning Sun | Mingzhu Cai | Huang He | Bingjin Chen | Siqi Bao | Yujiu Yang | Hua Wu | Haifeng Wang
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Language model families exhibit a striking disparity in their capacity to benefit from reinforcement learning: under identical training, models like Qwen achieve substantial gains, while others like Llama yield limited improvements. Complementing data-centric approaches, we reveal that this disparity reflects a hidden structural property: **distributional clarity** in probability space. Through a three-stage analysis (from phenomenon to mechanism to interpretation), we find that RL-friendly models exhibit intra-class compactness and inter-class separation in the probabilities they assign to correct vs. incorrect responses. We quantify this clarity using the **Silhouette Coefficient** (S) and demonstrate that (1) high S correlates strongly with RL performance; (2) low S is associated with severe logic errors and reasoning instability. To confirm this property, we introduce a Silhouette-Aware Reweighting strategy that prioritizes low-S samples during training. Experiments across six mathematical benchmarks show consistent improvements across all model families, with gains of up to 5.9 points on AIME24. Our work establishes distributional clarity as a fundamental, trainable property underlying RL-friendliness.
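To make the clarity measure concrete, the sketch below computes the standard Silhouette Coefficient over scalar scores grouped into correct and incorrect responses. It is an illustration only: the abstract does not specify the paper's exact features or distance, so the use of one log-probability per response, the absolute-difference distance, and the function name `silhouette_1d` are all assumptions made here.

```python
import numpy as np


def silhouette_1d(values, labels):
    """Mean silhouette coefficient for scalar points with class labels.

    Assumption (not from the paper): each response is summarized by a
    single scalar, e.g. its sequence log-probability, and the distance
    between two responses is the absolute difference of those scalars.
    """
    values = np.asarray(values, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    if classes.size < 2:
        raise ValueError("need at least two classes")

    # Pairwise absolute distances between all points.
    dist = np.abs(values[:, None] - values[None, :])
    scores = np.zeros(len(values))

    for i in range(len(values)):
        same = labels == labels[i]
        n_same = int(same.sum())
        if n_same == 1:
            # Usual convention: a singleton cluster scores 0.
            continue
        # a: mean distance to the other members of the same class
        # (the self-distance is 0, so divide by n_same - 1).
        a = dist[i, same].sum() / (n_same - 1)
        # b: smallest mean distance to the members of any other class.
        b = min(dist[i, labels == c].mean() for c in classes if c != labels[i])
        denom = max(a, b)
        scores[i] = 0.0 if denom == 0 else (b - a) / denom

    return float(scores.mean())


# Hypothetical log-probabilities for six sampled responses to one prompt.
# label 1 = correct, label 0 = incorrect.
labels = [1, 1, 1, 0, 0, 0]
clear = silhouette_1d([-1.0, -1.2, -0.9, -5.0, -5.3, -4.8], labels)
mixed = silhouette_1d([-1.0, -5.0, -1.2, -5.3, -0.9, -4.8], labels)
print(f"separated: {clear:.3f}  overlapping: {mixed:.3f}")
```

In the first call, correct and incorrect responses sit in tight, well-separated groups, so S is close to 1. In the second, the two classes overlap in probability space and S falls to around 0, which is the low-clarity regime the abstract associates with weaker RL gains.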