Exploring Practical Gaps in Using Cross Entropy to Implement Maximum Mutual Information Criterion for Rationalization
Wei Liu, Zhiying Deng, Zhongyu Niu, Jun Wang, Haozhao Wang, Ruixuan Li
Abstract
Rationalization is a framework that aims to build self-explanatory NLP models by extracting a human-intelligible subset of the input text. It involves a cooperative game in which a selector picks the most human-intelligible parts of the input as the rationale, and a predictor then makes predictions based on the selected rationale. Existing literature uses the cross-entropy between the model’s predictions and the ground-truth labels to measure the informativeness of the selected rationales, guiding the selector to choose better ones. In this study, we first theoretically analyze the objective of rationalization by decomposing it into two parts: the model-agnostic informativeness of the rationale candidates and the predictor’s degree of fit. We then provide empirical evidence that, under this framework, the selector tends to sample from a small, limited region, causing the predictor to overfit these localized areas. This results in a significant mismatch between the cross-entropy objective and the informativeness of the rationale candidates, leading to suboptimal solutions. To address this issue, we propose a simple yet effective method that introduces random vicinal¹ perturbations to the selected rationale candidates. This approach broadens the predictor’s assessment to a vicinity around the selected rationale candidate. Compared to recent competitive methods, our method significantly improves rationale quality (by up to 6.6%) across six widely used classification datasets.
¹ The term “vicinal” is borrowed from vicinal risk minimization (Chapelle et al., 2000); “vicinal” means neighboring or adjacent.
- Anthology ID:
- 2025.tacl-1.28
- Volume:
- Transactions of the Association for Computational Linguistics, Volume 13
- Year:
- 2025
- Address:
- Cambridge, MA
- Venue:
- TACL
- Publisher:
- MIT Press
- Pages:
- 577–594
- URL:
- https://preview.aclanthology.org/corrections-2025-07/2025.tacl-1.28/
- DOI:
- 10.1162/tacl_a_00758
- Cite (ACL):
- Wei Liu, Zhiying Deng, Zhongyu Niu, Jun Wang, Haozhao Wang, and Ruixuan Li. 2025. Exploring Practical Gaps in Using Cross Entropy to Implement Maximum Mutual Information Criterion for Rationalization. Transactions of the Association for Computational Linguistics, 13:577–594.
- Cite (Informal):
- Exploring Practical Gaps in Using Cross Entropy to Implement Maximum Mutual Information Criterion for Rationalization (Liu et al., TACL 2025)
- PDF:
- https://preview.aclanthology.org/corrections-2025-07/2025.tacl-1.28.pdf