Guirong Bai


2020

Pre-trained Language Model Based Active Learning for Sentence Matching
Guirong Bai | Shizhu He | Kang Liu | Jun Zhao | Zaiqing Nie
Proceedings of the 28th International Conference on Computational Linguistics

Active learning can significantly reduce the annotation cost for data-driven techniques. However, previous active learning approaches for natural language processing mainly depend on the entropy-based uncertainty criterion and ignore the characteristics of natural language. In this paper, we propose a pre-trained language model based active learning approach for sentence matching. Unlike previous active learning approaches, it draws linguistic criteria from the pre-trained language model to measure instances and help select more effective instances for annotation. Experiments demonstrate that our approach can achieve greater accuracy with fewer labeled training instances.
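For readers unfamiliar with the entropy-based uncertainty criterion that the abstract contrasts against, the following is a minimal sketch of that baseline only, not of the paper's linguistic criteria. The function names, the pool of probabilities, and the batch size `k` are all hypothetical and chosen purely for illustration.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_most_uncertain(pool_probs, k):
    """Return indices of the k unlabeled instances with the highest predictive entropy."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical (match, no-match) probabilities that some classifier
# might assign to four unlabeled sentence pairs.
pool_probs = [[0.95, 0.05], [0.55, 0.45], [0.70, 0.30], [0.50, 0.50]]

# Pick the two pairs the classifier is least sure about for annotation.
chosen = select_most_uncertain(pool_probs, k=2)
```

In this sketch the pairs whose predicted probabilities are closest to uniform are selected first. The criterion looks only at the classifier's output distribution, which is the limitation the abstract points to when it says such approaches ignore the characteristics of natural language.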