Benchmarking Long-tail Generalization with Likelihood Splits

Ameya Godbole; Robin Jia

doi:10.18653/v1/2023.findings-eacl.71

Abstract

In order to reliably process natural language, NLP systems must generalize to the long tail of rare utterances. We propose a method to create challenging benchmarks that require generalizing to the tail of the distribution by re-splitting existing datasets. We create ‘Likelihood Splits’ where examples that are assigned lower likelihood by a pre-trained language model (LM) are placed in the test set, and more likely examples are in the training set. This simple approach can be customized to construct meaningful train-test splits for a wide range of tasks. Likelihood Splits surface more challenges than random splits: relative error rates of state-of-the-art models increase by 59% for semantic parsing on Spider, 93% for natural language inference on SNLI, and 33% for yes/no question answering on BoolQ, on our splits compared with the corresponding random splits. Moreover, Likelihood Splits create fairer benchmarks than adversarial filtering; when the LM used to create the splits is also employed as the task model, our splits do not unfairly penalize the LM.
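The core splitting step described in the abstract can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name `likelihood_split`, the `score_fn` callable, and the `test_fraction` parameter are all hypothetical, and the scores in the usage lines are made-up numbers standing in for log-likelihoods that a pre-trained LM would assign. Any task-specific customization the abstract alludes to is not modeled here.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def likelihood_split(
    examples: Sequence[T],
    score_fn: Callable[[T], float],
    test_fraction: float = 0.2,
) -> Tuple[List[T], List[T]]:
    """Return (train, test), with the lowest-scoring examples in test.

    score_fn is assumed to return an LM log-likelihood (higher = more likely).
    test_fraction is an illustrative choice, not a value from the paper.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    # Rank from least likely to most likely.
    ranked = sorted(examples, key=score_fn)
    n_test = max(1, round(len(ranked) * test_fraction))
    # The unlikely tail becomes the test set; the rest is training data.
    return ranked[n_test:], ranked[:n_test]


# Toy usage with made-up scores in place of real LM log-likelihoods.
toy_scores = {
    "the cat sat": -5.0,
    "a dog ran": -6.5,
    "colorless green ideas": -14.2,
    "it is raining": -4.1,
    "quokkas juggle zithers": -21.7,
}
train, test = likelihood_split(list(toy_scores), toy_scores.get, test_fraction=0.4)
```

With these toy scores, the two least likely utterances land in `test` and the three most likely in `train`. In practice `score_fn` would wrap a call to whichever language model is used to score the examples.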

Anthology ID: 2023.findings-eacl.71
Volume: Findings of the Association for Computational Linguistics: EACL 2023
Month: May
Year: 2023
Address: Dubrovnik, Croatia
Editors: Andreas Vlachos, Isabelle Augenstein
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 963–983
URL: https://preview.aclanthology.org/nschneid-patch-2/2023.findings-eacl.71/
DOI: 10.18653/v1/2023.findings-eacl.71
Cite (ACL): Ameya Godbole and Robin Jia. 2023. Benchmarking Long-tail Generalization with Likelihood Splits. In Findings of the Association for Computational Linguistics: EACL 2023, pages 963–983, Dubrovnik, Croatia. Association for Computational Linguistics.
Cite (Informal): Benchmarking Long-tail Generalization with Likelihood Splits (Godbole & Jia, Findings 2023)
PDF: https://preview.aclanthology.org/nschneid-patch-2/2023.findings-eacl.71.pdf
Video: https://preview.aclanthology.org/nschneid-patch-2/2023.findings-eacl.71.mp4
