Abstract
We study the problem of generating counterfactual text for a classifier as a means for understanding and debugging classification. Given a textual input and a classification model, we aim to minimally alter the text to change the model’s prediction. White-box approaches have been successfully applied to similar problems in vision where one can directly optimize the continuous input. Optimization-based approaches become difficult in the language domain due to the discrete nature of text. We bypass this issue by directly optimizing in the latent space and leveraging a language model to generate candidate modifications from optimized latent representations. We additionally use Shapley values to estimate the combinatoric effect of multiple changes. We then use these estimates to guide a beam search for the final counterfactual text. We achieve favorable performance compared to recent white-box and black-box baselines using human and automatic evaluations. Ablation studies show that both latent optimization and the use of Shapley values improve success rate and the quality of the generated counterfactuals.
- Anthology ID:
- 2021.emnlp-main.452
- Volume:
- Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
- Month:
- November
- Year:
- 2021
- Address:
- Online and Punta Cana, Dominican Republic
- Venue:
- EMNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 5578–5593
- URL:
- https://aclanthology.org/2021.emnlp-main.452
- DOI:
- 10.18653/v1/2021.emnlp-main.452
- Cite (ACL):
- Xiaoli Fern and Quintin Pope. 2021. Text Counterfactuals via Latent Optimization and Shapley-Guided Search. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 5578–5593, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
- Cite (Informal):
- Text Counterfactuals via Latent Optimization and Shapley-Guided Search (Fern & Pope, EMNLP 2021)
- PDF:
- https://preview.aclanthology.org/remove-xml-comments/2021.emnlp-main.452.pdf
- Code:
- QuintinPope/CLOSS
- Data:
- IMDb Movie Reviews
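
The abstract mentions using Shapley values to estimate the combined effect of several candidate changes. As a rough illustration of that idea only, the sketch below computes exact Shapley values for three candidate token substitutions. It is not the authors' implementation (their code is in the QuintinPope/CLOSS repository listed above). The example sentence, the candidate edits, and `toy_target_score` are hypothetical stand-ins for a real classifier, and exact enumeration over all subsets is only feasible for a handful of edits.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value_fn):
    """Exact Shapley value of each player, given value_fn(frozenset) -> float."""
    n = len(players)
    values = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain p.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of p when added to coalition s.
                total += weight * (value_fn(s | {p}) - value_fn(s))
        values[p] = total
    return values


# Hypothetical example: each "player" is a (position, replacement token) edit.
tokens = "the movie was good and fun".split()
candidate_edits = [(3, "bad"), (5, "dull"), (1, "film")]

NEGATIVE = {"bad", "dull"}


def toy_target_score(toks):
    """Hypothetical stand-in for a classifier's score for the target label.

    Two negative words together score more than twice one alone, so the
    edits interact rather than contributing independently.
    """
    hits = sum(t in NEGATIVE for t in toks)
    return {0: 0.0, 1: 0.3}.get(hits, 1.0)


def apply_edits(toks, edits):
    """Return a copy of toks with every (position, token) edit applied."""
    out = list(toks)
    for pos, new_token in edits:
        out[pos] = new_token
    return out


def coalition_value(edits):
    """Target-label score of the text after applying a subset of edits."""
    return toy_target_score(apply_edits(tokens, edits))


scores = shapley_values(candidate_edits, coalition_value)
for edit, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(edit, round(score, 3))
```

In this toy setup the two sentiment-bearing edits split the credit equally and the neutral edit receives none. A search procedure could use such a ranking to decide which edits to try first, although how the paper integrates the estimates into its beam search is not specified in the abstract.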