Optimal and efficient text counterfactuals using Graph Neural Networks

Dimitris Lymperopoulos; Maria Lymperaiou; Giorgos Filandrianos; Giorgos Stamou

doi:10.18653/v1/2024.blackboxnlp-1.1

Abstract

As NLP models become increasingly integral to decision-making processes, the need for explainability and interpretability has become paramount. In this work, we propose a framework that addresses this need by generating semantically edited inputs, known as counterfactual interventions, which change the model prediction and thus serve as counterfactual explanations for the model. We frame the search for optimal counterfactual interventions as a graph assignment problem and employ a GNN to solve it, thus achieving high efficiency. We test our framework on two NLP tasks, binary sentiment classification and topic classification, and show that the generated edits are contrastive, fluent and minimal, while the whole process remains significantly faster than other state-of-the-art counterfactual editors.

Anthology ID: 2024.blackboxnlp-1.1
Volume: Proceedings of the 7th BlackboxNLP Workshop: Analyzing and Interpreting Neural Networks for NLP
Month: November
Year: 2024
Address: Miami, Florida, US
Editors: Yonatan Belinkov, Najoung Kim, Jaap Jumelet, Hosein Mohebbi, Aaron Mueller, Hanjie Chen
Venues: BlackboxNLP | WS
Publisher: Association for Computational Linguistics
Pages: 1–14
URL: https://preview.aclanthology.org/fix-sig-urls/2024.blackboxnlp-1.1/
DOI: 10.18653/v1/2024.blackboxnlp-1.1
Cite (ACL): Dimitris Lymperopoulos, Maria Lymperaiou, Giorgos Filandrianos, and Giorgos Stamou. 2024. Optimal and efficient text counterfactuals using Graph Neural Networks. In Proceedings of the 7th BlackboxNLP Workshop: Analyzing and Interpreting Neural Networks for NLP, pages 1–14, Miami, Florida, US. Association for Computational Linguistics.
Cite (Informal): Optimal and efficient text counterfactuals using Graph Neural Networks (Lymperopoulos et al., BlackboxNLP 2024)
PDF: https://preview.aclanthology.org/fix-sig-urls/2024.blackboxnlp-1.1.pdf
