On the Interplay Between Fine-tuning and Sentence-level Probing for Linguistic Knowledge in Pre-trained Transformers

Marius Mosbach; Anna Khokhlova; Michael A. Hedderich; Dietrich Klakow

doi:10.18653/v1/2020.findings-emnlp.227

Abstract

Fine-tuning pre-trained contextualized embedding models has become an integral part of the NLP pipeline. At the same time, probing has emerged as a way to investigate the linguistic knowledge captured by pre-trained models. However, very little is understood about how fine-tuning affects the representations of pre-trained models and thereby the linguistic knowledge they encode. This paper contributes towards closing this gap. We study three pre-trained models (BERT, RoBERTa, and ALBERT) and investigate through sentence-level probing how fine-tuning affects their representations. We find that for some probing tasks fine-tuning leads to substantial changes in accuracy, possibly suggesting that fine-tuning introduces or even removes linguistic knowledge from a pre-trained model. These changes, however, vary greatly across models, fine-tuning tasks, and probing tasks. Our analysis reveals that while fine-tuning indeed changes the representations of a pre-trained model, and these changes are typically larger for higher layers, only in very few cases does fine-tuning have a positive effect on probing accuracy that is larger than just using the pre-trained model with a strong pooling method. Based on our findings, we argue that both positive and negative effects of fine-tuning on probing require careful interpretation.
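
Sentence-level probing needs a single vector per sentence, so the per-token hidden states of a model have to be pooled before a probing classifier can be trained on them. The abstract does not say which pooling methods are compared, so the following is only a minimal sketch of two common choices, assuming first-token ([CLS]) pooling and mean pooling over non-padding tokens. The function names and the toy hidden states are hypothetical and are not taken from the paper.

```python
from typing import List

Vector = List[float]


def cls_pool(token_vectors: List[Vector]) -> Vector:
    """Use the first token's vector (the [CLS] position in BERT-style models)."""
    return list(token_vectors[0])


def mean_pool(token_vectors: List[Vector], attention_mask: List[int]) -> Vector:
    """Average the vectors of non-padding tokens (mask value 1)."""
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    if not kept:
        raise ValueError("attention_mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Toy hidden states for one sentence (assumed values, not real model output):
# three real tokens followed by one padding token.
hidden_states = [[1.0, 0.0], [3.0, 2.0], [2.0, 4.0], [9.0, 9.0]]
attention_mask = [1, 1, 1, 0]

cls_vector = cls_pool(hidden_states)
mean_vector = mean_pool(hidden_states, attention_mask)
```

In a probing setup of this kind, either pooled vector would be the input to a small classifier trained on a probing task, with the underlying model kept frozen. Comparing probing accuracy across pooling choices is one way to separate the effect of fine-tuning from the effect of how the sentence representation is built.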

Anthology ID: 2020.findings-emnlp.227
Volume: Findings of the Association for Computational Linguistics: EMNLP 2020
Month: November
Year: 2020
Address: Online
Editors: Trevor Cohn, Yulan He, Yang Liu
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 2502–2516
URL: https://aclanthology.org/2020.findings-emnlp.227
DOI: 10.18653/v1/2020.findings-emnlp.227
Cite (ACL): Marius Mosbach, Anna Khokhlova, Michael A. Hedderich, and Dietrich Klakow. 2020. On the Interplay Between Fine-tuning and Sentence-level Probing for Linguistic Knowledge in Pre-trained Transformers. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 2502–2516, Online. Association for Computational Linguistics.
Cite (Informal): On the Interplay Between Fine-tuning and Sentence-level Probing for Linguistic Knowledge in Pre-trained Transformers (Mosbach et al., Findings 2020)
PDF: https://preview.aclanthology.org/nschneid-patch-3/2020.findings-emnlp.227.pdf
Data: CoLA, GLUE, SQuAD, SST, SST-2, WikiText-2
