Abstract
In this paper, we propose a method to modify natural textual entailment problem datasets so that they better reflect a more precise notion of entailment. We apply this method to a subset of the Recognizing Textual Entailment datasets. We thus obtain a new corpus of entailment problems, which has the following three characteristics: (1) it is precise (it does not leave out implicit hypotheses); (2) it is based on “real-world” texts (i.e. most of the premises were written for purposes other than testing textual entailment); (3) it contains 150 problems. Broadly, the method that we employ is to make any missing hypotheses explicit using a crowd of experts. We discuss the relevance of our method in improving existing NLI datasets to be more fit for precise reasoning, and we argue that this corpus can be a first step towards wide-coverage testing of precise natural-language inference systems.
- Anthology ID:
- 2020.lrec-1.844
- Volume:
- Proceedings of the Twelfth Language Resources and Evaluation Conference
- Month:
- May
- Year:
- 2020
- Address:
- Marseille, France
- Editors:
- Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
- Venue:
- LREC
- Publisher:
- European Language Resources Association
- Pages:
- 6835–6840
- Language:
- English
- URL:
- https://aclanthology.org/2020.lrec-1.844
- Cite (ACL):
- Jean-Philippe Bernardy and Stergios Chatzikyriakidis. 2020. Improving the Precision of Natural Textual Entailment Problem Datasets. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 6835–6840, Marseille, France. European Language Resources Association.
- Cite (Informal):
- Improving the Precision of Natural Textual Entailment Problem Datasets (Bernardy & Chatzikyriakidis, LREC 2020)
- PDF:
- https://preview.aclanthology.org/dois-2013-emnlp/2020.lrec-1.844.pdf