Abstract
Text classification is one of the most common tasks in Natural Language Processing. When proposing a new classification model, practitioners typically select a sample of items that the proposed model classified correctly while the baseline did not, and then try to observe patterns across those items to understand the proposed model’s strengths. However, this approach is not comprehensive and requires manual effort to spot patterns across text items. In this work, we propose a new evaluation methodology for performing qualitative assessment over multiple classification models. The methodology is designed to discover clusters of text items where 1) each cluster’s items exhibit a linguistic pattern and 2) the proposed model significantly outperforms the baseline when classifying those items. This helps practitioners learn what their proposed model captures better than the baseline without having to perform this process manually. We use a fine-tuned BERT and Logistic Regression as the two models to compare, with Sentiment Analysis as the downstream task. We show how our proposed evaluation methodology discovers various clusters of text items which BERT classifies significantly more accurately than the Logistic Regression baseline, thus providing insight into what BERT is powerful at capturing.
- Anthology ID:
- 2024.lrec-main.1066
- Volume:
- Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
- Month:
- May
- Year:
- 2024
- Address:
- Torino, Italia
- Editors:
- Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
- Venues:
- LREC | COLING
- Publisher:
- ELRA and ICCL
- Pages:
- 12187–12192
- URL:
- https://aclanthology.org/2024.lrec-main.1066
- Cite (ACL):
- Ahmad Aljanaideh. 2024. New Evaluation Methodology for Qualitatively Comparing Classification Models. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 12187–12192, Torino, Italia. ELRA and ICCL.
- Cite (Informal):
- New Evaluation Methodology for Qualitatively Comparing Classification Models (Aljanaideh, LREC-COLING 2024)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-4/2024.lrec-main.1066.pdf
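The core idea from the abstract, comparing two models' accuracy within clusters of text items, can be sketched in a few lines. This is a hypothetical illustration, not the paper's actual code: the clustering function, variable names, and the toy negation-based cluster are all assumptions made for the example.

```python
# Hypothetical sketch of cluster-level model comparison (not the paper's method).
# Given gold labels and per-item predictions from two classifiers, group items
# into clusters and report each cluster's accuracy gap (proposed - baseline).
from collections import defaultdict

def cluster_accuracy_gaps(items, gold, pred_proposed, pred_baseline, cluster_of):
    """Return {cluster: accuracy(proposed) - accuracy(baseline)} per cluster."""
    stats = defaultdict(lambda: [0, 0, 0])  # cluster -> [proposed hits, baseline hits, count]
    for x, y, p, b in zip(items, gold, pred_proposed, pred_baseline):
        c = cluster_of(x)
        stats[c][0] += int(p == y)
        stats[c][1] += int(b == y)
        stats[c][2] += 1
    return {c: ph / n - bh / n for c, (ph, bh, n) in stats.items()}

# Toy example: cluster by presence of negation, a simple linguistic pattern.
texts = ["not good", "great film", "not bad at all", "terrible plot"]
gold  = [0, 1, 1, 0]
prop  = [0, 1, 1, 0]   # proposed model: correct on every item
base  = [1, 1, 0, 0]   # baseline: wrong on both negated items
gaps = cluster_accuracy_gaps(
    texts, gold, prop, base,
    cluster_of=lambda t: "negation" if "not" in t.split() else "other",
)
# gaps["negation"] is 1.0, gaps["other"] is 0.0: the proposed model's advantage
# concentrates in the negation cluster, pointing at what it captures better.
```

In practice the `cluster_of` function would come from unsupervised clustering over text representations rather than a hand-written rule; the keyword rule here only stands in for a discovered linguistic pattern.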