Yilun Zhou


ExSum: From Local Explanations to Model Understanding
Yilun Zhou | Marco Tulio Ribeiro | Julie Shah
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Interpretability methods are developed to understand the working mechanisms of black-box models, which is crucial to their responsible deployment. Fulfilling this goal requires both that the explanations generated by these methods are correct and that people can easily and reliably understand them. While the former has been addressed in prior work, the latter is often overlooked, resulting in informal model understanding derived from a handful of local explanations. In this paper, we introduce explanation summary (ExSum), a mathematical framework for quantifying model understanding, and propose metrics for its quality assessment. On two domains, ExSum highlights various limitations in the current practice, helps develop accurate model understanding, and reveals easily overlooked properties of the model. We also connect understandability to other properties of explanations such as human alignment, robustness, and counterfactual similarity and plausibility.

The Irrationality of Neural Rationale Models
Yiming Zheng | Serena Booth | Julie Shah | Yilun Zhou
Proceedings of the 2nd Workshop on Trustworthy Natural Language Processing (TrustNLP 2022)

Neural rationale models are popular for interpretable predictions of NLP tasks. In these, a selector extracts segments of the input text, called rationales, and passes these segments to a classifier for prediction. Since the rationale is the only information accessible to the classifier, it is plausibly defined as the explanation. Is such a characterization unconditionally correct? In this paper, we argue to the contrary, with both philosophical perspectives and empirical evidence suggesting that rationale models are, perhaps, less rational and interpretable than expected. We call for more rigorous evaluations of these models to ensure desired properties of interpretability are indeed achieved. The code for our experiments is at https://github.com/yimingz89/Neural-Rationale-Analysis.

Learning Household Task Knowledge from WikiHow Descriptions
Yilun Zhou | Julie Shah | Steven Schockaert
Proceedings of the 5th Workshop on Semantic Deep Learning (SemDeep-5)