Abstract
“Pre-trained Language Models (PLMs), as parametric-based eager learners, have become the de-facto choice for current paradigms of Natural Language Processing (NLP). In contrast, k-Nearest-Neighbor (k-NN) classifiers, as the lazy learning paradigm, tend to mitigate over-fitting and isolated noise. In this paper, we revisit k-NN classifiers for augmenting the PLMs-based classifiers. From the methodological level, we propose to adopt k-NN with textual representations of PLMs in two steps: (1) Utilize k-NN as prior knowledge to calibrate the training process. (2) Linearly interpolate the probability distribution predicted by k-NN with that of the PLMs’ classifier. At the heart of our approach is the implementation of k-NN-calibrated training, which treats predicted results as indicators for easy versus hard examples during the training process. From the perspective of the diversity of application scenarios, we conduct extensive experiments on fine-tuning, prompt-tuning paradigms and zero-shot, few-shot and fully-supervised settings, respectively, across eight diverse end-tasks. We hope our exploration will encourage the community to revisit the power of classical methods for efficient NLP.”
- Anthology ID:
- 2023.ccl-1.75
- Volume:
- Proceedings of the 22nd Chinese National Conference on Computational Linguistics
- Month:
- August
- Year:
- 2023
- Address:
- Harbin, China
- Venue:
- CCL
- Publisher:
- Chinese Information Processing Society of China
- Pages:
- 889–897
- Language:
- English
- URL:
- https://aclanthology.org/2023.ccl-1.75
- Cite (ACL):
- Li Lei, Chen Jing, Tian Botzhong, and Zhang Ningyu. 2023. Revisiting k-NN for Fine-tuning Pre-trained Language Models. In Proceedings of the 22nd Chinese National Conference on Computational Linguistics, pages 889–897, Harbin, China. Chinese Information Processing Society of China.
- Cite (Informal):
- Revisiting k-NN for Fine-tuning Pre-trained Language Models (Lei et al., CCL 2023)
- PDF:
- https://preview.aclanthology.org/remove-xml-comments/2023.ccl-1.75.pdf
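
Step (2) of the abstract, linearly interpolating the k-NN distribution with the PLM classifier's distribution, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the majority-vote k-NN distribution, the Euclidean distance, and the values of `k` and `lam` are all assumptions chosen for illustration, and the toy vectors stand in for PLM text representations.

```python
import math
from collections import Counter


def knn_distribution(query, memory, labels, num_classes, k=3):
    """Class distribution from the k nearest stored representations.

    Assumption: a simple vote over neighbours under Euclidean distance;
    the paper may weight neighbours differently.
    """
    dists = [math.dist(query, vec) for vec in memory]
    nearest = sorted(range(len(memory)), key=lambda i: dists[i])[:k]
    counts = Counter(labels[i] for i in nearest)
    return [counts.get(c, 0) / len(nearest) for c in range(num_classes)]


def interpolate(p_knn, p_plm, lam=0.5):
    """Linear interpolation: lam * p_knn + (1 - lam) * p_plm (lam is hypothetical)."""
    return [lam * a + (1 - lam) * b for a, b in zip(p_knn, p_plm)]


# Toy datastore: 2-d vectors standing in for PLM representations of training texts.
memory = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
labels = [0, 0, 1, 1]

p_knn = knn_distribution((0.1, 0.2), memory, labels, num_classes=2, k=3)
p_plm = [0.4, 0.6]  # hypothetical softmax output of the PLM classifier
p_final = interpolate(p_knn, p_plm, lam=0.5)
prediction = max(range(len(p_final)), key=lambda c: p_final[c])
```

In this toy case the PLM classifier alone favours class 1, while two of the three nearest neighbours belong to class 0, so the interpolated distribution favours class 0. The weight `lam` controls how much the neighbours are trusted relative to the classifier.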