New Protocols and Negative Results for Textual Entailment Data Collection

Samuel Bowman; Jennimaria Palomaki; Livio Baldini Soares; Emily Pitler

doi:10.18653/v1/2020.emnlp-main.658

Abstract

Natural language inference (NLI) data has proven useful in benchmarking and, especially, as pretraining data for tasks requiring language understanding. However, the crowdsourcing protocol that was used to collect this data has known issues and was not explicitly optimized for either of these purposes, so it is likely far from ideal. We propose four alternative protocols, each aimed at improving either the ease with which annotators can produce sound training examples or the quality and diversity of those examples. Using these alternatives and a fifth baseline protocol, we collect and compare five new 8.5k-example training sets. In evaluations focused on transfer learning applications, our results are solidly negative, with models trained on our baseline dataset yielding good transfer performance to downstream tasks, but none of our four new methods (nor the recent ANLI) showing any improvements over that baseline. In a small silver lining, we observe that all four new protocols, especially those where annotators edit *pre-filled* text boxes, reduce previously observed issues with annotation artifacts.

Anthology ID: 2020.emnlp-main.658
Volume: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
Month: November
Year: 2020
Address: Online
Editors: Bonnie Webber, Trevor Cohn, Yulan He, Yang Liu
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 8203–8214
URL: https://aclanthology.org/2020.emnlp-main.658
DOI: 10.18653/v1/2020.emnlp-main.658
Cite (ACL): Samuel R. Bowman, Jennimaria Palomaki, Livio Baldini Soares, and Emily Pitler. 2020. New Protocols and Negative Results for Textual Entailment Data Collection. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 8203–8214, Online. Association for Computational Linguistics.
Cite (Informal): New Protocols and Negative Results for Textual Entailment Data Collection (Bowman et al., EMNLP 2020)
PDF: https://preview.aclanthology.org/corrections-2024-05/2020.emnlp-main.658.pdf
Video: https://slideslive.com/38939009
Code: google-research-datasets/Textual-Entailment-New-Protocols
Data: ANLI, BoolQ, COPA, GLUE, MultiNLI, MultiRC, ReCoRD, SNLI, SWAG, SuperGLUE, WSC, WiC
