Probing Classifiers: Promises, Shortcomings, and Advances

Yonatan Belinkov


Abstract
Probing classifiers have emerged as one of the prominent methodologies for interpreting and analyzing deep neural network models of natural language processing. The basic idea is simple: a classifier is trained to predict some linguistic property from a model’s representations. This setup has been used to examine a wide variety of models and properties. However, recent studies have demonstrated various methodological limitations of the approach. This squib critically reviews the probing classifiers framework, highlighting its promises, shortcomings, and advances.
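
To make the basic setup concrete, here is a minimal sketch of a probing classifier, not taken from the paper: a linear classifier is trained to predict a linguistic property from frozen model representations. The representations below are simulated with random vectors as a stand-in for hidden states extracted from a pretrained model, and the binary labels stand in for an arbitrary linguistic property; both are placeholders for illustration only.

```python
# Minimal probing-classifier sketch (illustrative only; representations
# and labels are simulated stand-ins, not real model outputs).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

# Stand-in for representations extracted from a pretrained model,
# e.g., one hidden vector per token; shape: (n_tokens, hidden_dim).
representations = rng.normal(size=(2000, 768))

# Stand-in linguistic property to probe for, e.g., noun vs. verb.
labels = rng.integers(0, 2, size=2000)

X_train, X_test, y_train, y_test = train_test_split(
    representations, labels, test_size=0.2, random_state=0
)

# The probe: a simple classifier trained on top of frozen representations.
probe = LogisticRegression(max_iter=1000)
probe.fit(X_train, y_train)

# Probing accuracy is commonly read as evidence of how much information
# about the property is (linearly) decodable from the representations.
print("probing accuracy:", accuracy_score(y_test, probe.predict(X_test)))
```

In practice the simulated arrays would be replaced by hidden states from the model under analysis and by gold annotations of the property of interest; the squib discusses why high probing accuracy alone can be a misleading measure.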
Anthology ID: 2022.cl-1.7
Volume: Computational Linguistics, Volume 48, Issue 1 - March 2022
Month: March
Year: 2022
Address: Cambridge, MA
Venue: CL
Publisher: MIT Press
Pages: 207–219
URL: https://aclanthology.org/2022.cl-1.7
DOI: 10.1162/coli_a_00422
Cite (ACL): Yonatan Belinkov. 2022. Probing Classifiers: Promises, Shortcomings, and Advances. Computational Linguistics, 48(1):207–219.
Cite (Informal): Probing Classifiers: Promises, Shortcomings, and Advances (Belinkov, CL 2022)
PDF: https://preview.aclanthology.org/paclic-22-ingestion/2022.cl-1.7.pdf