LMTurk: Few-Shot Learners as Crowdsourcing Workers in a Language-Model-as-a-Service Framework

Mengjie Zhao; Fei Mi; Yasheng Wang; Minglei Li; Xin Jiang; Qun Liu; Hinrich Schütze

doi:10.18653/v1/2022.findings-naacl.51

Abstract

Vast efforts have been devoted to creating high-performance few-shot learners, i.e., large-scale pretrained language models (PLMs) that perform well with little downstream task training data. Training PLMs has incurred significant cost, but utilizing the few-shot learners is still challenging due to their enormous size. This work focuses on a crucial question: How to make effective use of these few-shot learners? We propose LMTurk, a novel approach that treats few-shot learners as crowdsourcing workers. The rationale is that crowdsourcing workers are in fact few-shot learners: They are shown a few illustrative examples to learn about a task and then start annotating. LMTurk employs few-shot learners built upon PLMs as workers. We show that the resulting annotations can be utilized to train models that solve the task well and are small enough to be deployable in practical scenarios. Active learning is integrated into LMTurk to reduce the number of queries made to PLMs, minimizing the computational cost of running PLM inference passes. Altogether, LMTurk is an important step towards making effective use of current PLMs.
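
The workflow the abstract describes (a PLM-based worker annotates data, a small model is trained on those annotations, and active learning limits how often the PLM is queried) can be illustrated with a minimal sketch. This is not the authors' implementation: `plm_worker` is a hypothetical keyword rule standing in for an expensive few-shot PLM, `SmallModel` is a toy bag-of-words scorer standing in for the deployable model, and the uncertainty threshold `margin` is an assumed selection criterion chosen only for illustration.

```python
from collections import Counter


def plm_worker(text):
    """Hypothetical stand-in for a few-shot PLM annotator (costly in practice)."""
    return 1 if "good" in text or "great" in text else 0


class SmallModel:
    """Toy bag-of-words scorer standing in for the small deployable model."""

    def __init__(self):
        self.counts = {0: Counter(), 1: Counter()}

    def update(self, text, label):
        # "Training" here is just accumulating per-class token counts.
        self.counts[label].update(text.split())

    def score(self, text):
        # Positive-class evidence minus negative-class evidence over the tokens.
        return sum(self.counts[1][w] - self.counts[0][w] for w in text.split())

    def predict(self, text):
        return 1 if self.score(text) > 0 else 0


def annotate_with_active_learning(pool, margin=2):
    """Query the PLM worker only when the small model is uncertain (assumed rule)."""
    model, queries = SmallModel(), 0
    for text in pool:
        if abs(model.score(text)) < margin:
            model.update(text, plm_worker(text))
            queries += 1
    return model, queries


pool = ["good movie", "bad movie", "great film", "bad film", "good film", "bad plot"]
model, queries = annotate_with_active_learning(pool)
print(f"PLM queries: {queries} of {len(pool)} samples")
print("prediction for 'good plot':", model.predict("good plot"))
```

In this sketch the small model handles confident samples itself, so the simulated PLM is queried for only part of the pool. A real setup would replace the keyword rule with actual PLM inference and the count-based scorer with a trained classifier.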

Anthology ID: 2022.findings-naacl.51
Volume: Findings of the Association for Computational Linguistics: NAACL 2022
Month: July
Year: 2022
Address: Seattle, United States
Editors: Marine Carpuat, Marie-Catherine de Marneffe, Ivan Vladimir Meza Ruiz
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 675–692
URL: https://aclanthology.org/2022.findings-naacl.51
DOI: 10.18653/v1/2022.findings-naacl.51
Cite (ACL): Mengjie Zhao, Fei Mi, Yasheng Wang, Minglei Li, Xin Jiang, Qun Liu, and Hinrich Schuetze. 2022. LMTurk: Few-Shot Learners as Crowdsourcing Workers in a Language-Model-as-a-Service Framework. In Findings of the Association for Computational Linguistics: NAACL 2022, pages 675–692, Seattle, United States. Association for Computational Linguistics.
Cite (Informal): LMTurk: Few-Shot Learners as Crowdsourcing Workers in a Language-Model-as-a-Service Framework (Zhao et al., Findings 2022)
PDF: https://preview.aclanthology.org/nschneid-patch-4/2022.findings-naacl.51.pdf
Data: AG News, CoLA, GLUE, SST, SST-2, SST-5