Class Explanations: the Role of Domain-Specific Content and Stop Words
Denitsa Saynova, Bastiaan Bruinsma, Moa Johansson, Richard Johansson
Abstract
We address two understudied areas related to explainability for neural text models. The first is class explanations: which features are descriptive across a whole class, rather than explaining single input instances? The second is the type of features used to provide explanations: does an explanation rely on statistical patterns of word usage or on the presence of domain-specific content words? We present a method to extract class explanations, together with strategies to differentiate between two types of explanation: domain-specific signals and statistical variations in the frequencies of common words. We demonstrate our method in a case study in which we analyse transcripts of political debates in the Swedish Riksdag.
- Anthology ID:
- 2023.nodalida-1.12
- Volume:
- Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa)
- Month:
- May
- Year:
- 2023
- Address:
- Tórshavn, Faroe Islands
- Editors:
- Tanel Alumäe, Mark Fishel
- Venue:
- NoDaLiDa
- Publisher:
- University of Tartu Library
- Pages:
- 103–112
- URL:
- https://preview.aclanthology.org/ingest_wac_2008/2023.nodalida-1.12/
- Cite (ACL):
- Denitsa Saynova, Bastiaan Bruinsma, Moa Johansson, and Richard Johansson. 2023. Class Explanations: the Role of Domain-Specific Content and Stop Words. In Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa), pages 103–112, Tórshavn, Faroe Islands. University of Tartu Library.
- Cite (Informal):
- Class Explanations: the Role of Domain-Specific Content and Stop Words (Saynova et al., NoDaLiDa 2023)
- PDF:
- https://preview.aclanthology.org/ingest_wac_2008/2023.nodalida-1.12.pdf