LLM-induced Rationales for More Compact Explainable Style Classification Models

Ahmad Aljanaideh; Saeb Ganideh

Abstract

The complexity of recent natural language classification models has led to interest in methods for improving the performance of explainable models (e.g., Logistic Regression). Existing methods cluster word embeddings to discover fine-grained contextual features that can be used to train a linear model. While those methods narrow the performance gap between black-box and explainable models, they rely on discovering a large number of features, which hurts interpretability. In this work, we propose a model that leverages Large Language Models (LLMs) and clustering algorithms to discover a compact set of interpretable features. The proposed model first uses GPT-4o mini to extract rationales (i.e., phrases that explain an item’s label) from labeled text, and then clusters those rationales to obtain a compact, interpretable feature space. Across three Style Classification tasks, the resulting features achieve performance comparable to word-cluster baselines on most tasks, while reducing the number of features by 85–99%. These results highlight the potential of LLMs to improve the compactness of explainable AI models.
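
The two-stage idea in the abstract (extract rationales, then cluster them into a small feature space) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the rationales are hard-coded, hypothetical stand-ins for phrases an LLM might return; the greedy token-overlap clustering is a simple substitute for whatever clustering algorithm the paper uses; and the helper names `cluster_rationales` and `featurize` are invented here.

```python
from typing import List, Set


def tokens(phrase: str) -> Set[str]:
    """Lowercased whitespace tokens of a phrase."""
    return set(phrase.lower().split())


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Token-set overlap between two phrases, in [0, 1]."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_rationales(rationales: List[str], threshold: float = 0.3) -> List[List[str]]:
    """Greedy single-pass clustering (an assumed stand-in, not the paper's method):
    a rationale joins the first cluster whose seed (first member) is similar
    enough, otherwise it starts a new cluster."""
    clusters: List[List[str]] = []
    for r in rationales:
        for c in clusters:
            if jaccard(tokens(r), tokens(c[0])) >= threshold:
                c.append(r)
                break
        else:
            clusters.append([r])
    return clusters


def featurize(text: str, clusters: List[List[str]]) -> List[int]:
    """One binary feature per cluster: 1 if any member phrase occurs in the text."""
    lowered = text.lower()
    return [int(any(r.lower() in lowered for r in c)) for c in clusters]


# Hypothetical rationales, standing in for phrases an LLM might return
# when asked which parts of a text explain its style label.
rationales = [
    "thank you so much",
    "thank you",
    "could you please",
    "please",
    "shut up",
]

clusters = cluster_rationales(rationales)
features = featurize("Could you please send the file? Thank you!", clusters)
print(len(clusters), features)
```

In this toy setup the feature vector has one dimension per rationale cluster rather than one per word cluster, which is the sense in which the feature space is compact; such vectors could then be passed to a linear model such as Logistic Regression.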

Anthology ID: 2026.findings-acl.1426
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 28571–28577
URL: https://preview.aclanthology.org/ingest-acl-workshops/2026.findings-acl.1426/
Cite (ACL): Ahmad Aljanaideh and Saeb Ganideh. 2026. LLM-induced Rationales for More Compact Explainable Style Classification Models. In Findings of the Association for Computational Linguistics: ACL 2026, pages 28571–28577, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): LLM-induced Rationales for More Compact Explainable Style Classification Models (Aljanaideh & Ganideh, Findings 2026)
PDF: https://preview.aclanthology.org/ingest-acl-workshops/2026.findings-acl.1426.pdf
Checklist: 2026.findings-acl.1426.checklist.pdf