Abstract
There currently exists a multitude of pre-trained transformer language models (LMs) that are readily available. From a practical perspective, this raises the question of which pre-trained LM will perform best if fine-tuned for a specific downstream NLP task. However, exhaustively fine-tuning all available LMs to determine the best-fitting model is computationally infeasible. To address this problem, we present an approach that inexpensively estimates a ranking of the expected performance of a given set of candidate LMs for a given task. Following a layer-wise representation analysis, we extend existing approaches such as H-score and LogME by aggregating representations across all layers of the transformer model. We present an extensive analysis of 20 transformer LMs, 6 downstream NLP tasks, and various estimators (linear probing, kNN, H-score, and LogME). Our evaluation finds that averaging the layer representations significantly improves the Pearson correlation coefficient between the true model ranks and the estimate, increasing from 0.58 to 0.86 for LogME and from 0.65 to 0.88 for H-score.
- Anthology ID:
- 2024.findings-acl.757
- Volume:
- Findings of the Association for Computational Linguistics ACL 2024
- Month:
- August
- Year:
- 2024
- Address:
- Bangkok, Thailand and virtual meeting
- Editors:
- Lun-Wei Ku, Andre Martins, Vivek Srikumar
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 12752–12768
- URL:
- https://aclanthology.org/2024.findings-acl.757
- Cite (ACL):
- Lukas Garbaciauskas, Max Ploner, and Alan Akbik. 2024. Choose Your Transformer: Improved Transferability Estimation of Transformer Models on Classification Tasks. In Findings of the Association for Computational Linguistics ACL 2024, pages 12752–12768, Bangkok, Thailand and virtual meeting. Association for Computational Linguistics.
- Cite (Informal):
- Choose Your Transformer: Improved Transferability Estimation of Transformer Models on Classification Tasks (Garbaciauskas et al., Findings 2024)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-4/2024.findings-acl.757.pdf
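
The layer-averaging idea described in the abstract can be illustrated with a minimal sketch. It assumes per-layer sentence embeddings have already been extracted into an array of shape (layers, examples, dim), and it uses the commonly cited H-score definition, tr(cov(f)^-1 cov(E[f | y])). The function names, the pseudo-inverse, and the synthetic data are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def layer_mean(hidden_states):
    """Average per-layer embeddings: (layers, n, dim) -> (n, dim)."""
    return np.mean(hidden_states, axis=0)


def h_score(features, labels):
    """H-score: tr(pinv(cov(f)) @ cov(E[f | y])). Higher suggests better transfer."""
    f = features - features.mean(axis=0, keepdims=True)
    cov_f = f.T @ f / len(f)
    # Replace every row by the mean of its class (the conditional expectation).
    g = np.zeros_like(f)
    for c in np.unique(labels):
        mask = labels == c
        g[mask] = f[mask].mean(axis=0)
    cov_g = g.T @ g / len(f)
    # Pseudo-inverse is an assumption made here to tolerate singular covariances.
    return float(np.trace(np.linalg.pinv(cov_f) @ cov_g))


# Synthetic stand-ins for two hypothetical candidate models (not real LMs).
rng = np.random.default_rng(0)
labels = rng.integers(0, 3, size=300)
class_means = rng.normal(size=(3, 8))
candidates = {
    "informative": 2.0 * class_means[labels][None, :, :] + rng.normal(size=(4, 300, 8)),
    "noise_only": rng.normal(size=(4, 300, 8)),
}

# Score each candidate on its layer-averaged representation, then rank.
scores = {name: h_score(layer_mean(h), labels) for name, h in candidates.items()}
for name in sorted(scores, key=scores.get, reverse=True):
    print(name, round(scores[name], 3))
```

In a real setting, the estimated ranking could then be compared against fine-tuned performance with a correlation coefficient, for example via `np.corrcoef`.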