QAmeleon: Multilingual QA with Only 5 Examples

Priyanka Agrawal, Chris Alberti, Fantine Huot, Joshua Maynez, Ji Ma, Sebastian Ruder, Kuzman Ganchev, Dipanjan Das, Mirella Lapata


Abstract
The availability of large, high-quality datasets has been a major driver of recent progress in question answering (QA). Such annotated datasets, however, are difficult and costly to collect, and rarely exist in languages other than English, rendering QA technology inaccessible to underrepresented languages. An alternative to building large monolingual training datasets is to leverage pre-trained language models (PLMs) in a few-shot learning setting. Our approach, QAmeleon, uses a PLM to automatically generate multilingual data upon which QA models are fine-tuned, thus avoiding costly annotation. Prompt tuning the PLM with only five examples per language delivers accuracy superior to translation-based baselines; it bridges nearly 60% of the gap between an English-only baseline and a fully-supervised upper bound fine-tuned on almost 50,000 hand-labeled examples; and it consistently leads to improvements over directly fine-tuning a QA model on labeled examples in low-resource settings. Experiments on the TyDiQA-GoldP and MLQA benchmarks show that few-shot prompt tuning for data synthesis scales across languages and is a viable alternative to large-scale annotation.
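
For intuition, the following is a minimal Python sketch of the data-synthesis recipe the abstract describes: prompt a multilingual PLM with a handful of labeled (passage, question, answer) exemplars in the target language, generate synthetic QA pairs for unlabeled passages, and use those pairs as training data for a QA model. This is not the authors' code: the paper prompt-tunes the PLM on the five examples, whereas the sketch uses plain few-shot prompting for brevity, and the model name, prompt format, and the plm_generate / build_prompt / synthesize_qa helpers are illustrative assumptions.

    # Minimal sketch of few-shot synthetic QA data generation (illustrative only).
    from transformers import pipeline

    # Any multilingual text-to-text PLM could be plugged in here (assumption).
    plm = pipeline("text2text-generation", model="google/mt5-small")

    def plm_generate(prompt: str) -> str:
        """Hypothetical helper: a single generation call to the PLM."""
        return plm(prompt, max_new_tokens=64)[0]["generated_text"]

    def build_prompt(exemplars, passage):
        """Few-shot prompt: ~5 labeled exemplars in the target language,
        followed by an unlabeled passage for which a QA pair is requested."""
        parts = []
        for ex in exemplars:
            parts.append(
                f"Passage: {ex['passage']}\n"
                f"Question: {ex['question']}\n"
                f"Answer: {ex['answer']}\n"
            )
        parts.append(f"Passage: {passage}\nQuestion:")
        return "\n".join(parts)

    def synthesize_qa(exemplars, unlabeled_passages):
        """Generate synthetic (passage, question, answer) triples."""
        synthetic = []
        for passage in unlabeled_passages:
            output = plm_generate(build_prompt(exemplars, passage))
            # Illustrative parsing: expect "<question> Answer: <answer>".
            question, _, answer = output.partition("Answer:")
            synthetic.append({
                "passage": passage,
                "question": question.strip(),
                "answer": answer.strip(),
            })
        return synthetic

The synthetic triples would then replace large-scale human annotation as training data for a standard extractive QA model in each target language.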
Anthology ID:
2023.tacl-1.98
Volume:
Transactions of the Association for Computational Linguistics, Volume 11
Year:
2023
Address:
Cambridge, MA
Venue:
TACL
Publisher:
MIT Press
Pages:
1754–1771
URL:
https://aclanthology.org/2023.tacl-1.98
DOI:
10.1162/tacl_a_00625
Cite (ACL):
Priyanka Agrawal, Chris Alberti, Fantine Huot, Joshua Maynez, Ji Ma, Sebastian Ruder, Kuzman Ganchev, Dipanjan Das, and Mirella Lapata. 2023. QAmeleon: Multilingual QA with Only 5 Examples. Transactions of the Association for Computational Linguistics, 11:1754–1771.
Cite (Informal):
QAmeleon: Multilingual QA with Only 5 Examples (Agrawal et al., TACL 2023)
PDF:
https://preview.aclanthology.org/ijclclp-past-ingestion/2023.tacl-1.98.pdf