Abstract
Feature importance is commonly used to explain machine predictions. While feature importance can be derived from a machine learning model with a variety of methods, the consistency of feature importance across different methods remains understudied. In this work, we systematically compare feature importance from built-in mechanisms in a model, such as attention values, and from post-hoc methods that approximate model behavior, such as LIME. Using text classification as a testbed, we find that 1) regardless of which method we use, important features from traditional models such as SVM and XGBoost are more similar to each other than to those from deep learning models; 2) post-hoc methods tend to generate more similar important features for two models than built-in methods do. We further demonstrate how such similarity varies across instances. Notably, important features do not always resemble each other more closely when two models agree on the predicted label than when they disagree.
- Anthology ID:
- D19-1046
- Volume:
- Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
- Month:
- November
- Year:
- 2019
- Address:
- Hong Kong, China
- Editors:
- Kentaro Inui, Jing Jiang, Vincent Ng, Xiaojun Wan
- Venues:
- EMNLP | IJCNLP
- SIG:
- SIGDAT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 486–495
- URL:
- https://aclanthology.org/D19-1046
- DOI:
- 10.18653/v1/D19-1046
- Cite (ACL):
- Vivian Lai, Zheng Cai, and Chenhao Tan. 2019. Many Faces of Feature Importance: Comparing Built-in and Post-hoc Feature Importance in Text Classification. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 486–495, Hong Kong, China. Association for Computational Linguistics.
- Cite (Informal):
- Many Faces of Feature Importance: Comparing Built-in and Post-hoc Feature Importance in Text Classification (Lai et al., EMNLP-IJCNLP 2019)
- PDF:
- https://preview.aclanthology.org/naacl24-info/D19-1046.pdf
- Code
- BoulderDS/feature-importance
- Data
- SST, SST-5
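The contrast the abstract draws between built-in and post-hoc feature importance can be sketched with a toy example. This is not the paper's code: the word weights, the interaction term, and the leave-one-out perturbation below are illustrative stand-ins for real built-in scores (e.g. attention values or linear coefficients) and real post-hoc methods (e.g. LIME).

```python
# Hypothetical linear bag-of-words scorer with one hand-added
# interaction term; all weights are made up for illustration.
WEIGHTS = {"great": 2.0, "boring": -1.5, "plot": 0.3}

def score(tokens):
    """Model score: sum of word weights, plus an interaction bonus
    that the per-word weights alone cannot express."""
    s = sum(WEIGHTS.get(t, 0.0) for t in tokens)
    if "great" in tokens and "plot" in tokens:
        s += 1.0  # interaction term
    return s

def builtin_importance(tokens):
    """Built-in importance: the model's own per-word weights."""
    return {t: WEIGHTS.get(t, 0.0) for t in tokens}

def posthoc_importance(tokens):
    """Post-hoc importance: drop in score when a word is removed
    (leave-one-out perturbation, a crude stand-in for LIME)."""
    base = score(tokens)
    return {t: base - score([u for u in tokens if u != t]) for t in tokens}

text = "great plot but boring ending".split()
print(builtin_importance(text)["great"])  # 2.0 (the stored weight)
print(posthoc_importance(text)["great"])  # 3.0 (weight + lost interaction)
```

Because the perturbation also picks up the interaction term, the two notions of importance disagree on "great" (2.0 vs 3.0), a minimal version of the divergence between built-in and post-hoc methods that the paper measures at scale.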