A hierarchical taxonomy for classifying hardness of inference tasks

Martin Gleize, Brigitte Grau


Abstract
Exhibiting inferential capabilities is one of the major goals of many modern Natural Language Processing systems. However, while attempts have been made to define what textual inferences are, few seek to classify inference phenomena by difficulty. In this paper we propose a hierarchical taxonomy of inferences according to their hardness, designed with corpus annotation as well as system design and evaluation in mind. Indeed, a fine-grained assessment of the difficulty of a task allows us to design more appropriate systems and to evaluate them only on what they are designed to handle. Each of the seven classes is described and illustrated with examples from different tasks such as question answering, textual entailment and coreference resolution. We then test the classes of our hierarchy on the specific task of question answering. Our annotation of the test data from the QA4MRE 2013 evaluation campaign shows that it is possible to quantify contrasts in types of difficulty across datasets of the same task.
Anthology ID:
L14-1119
Volume:
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
Month:
May
Year:
2014
Address:
Reykjavik, Iceland
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
3034–3040
URL:
http://www.lrec-conf.org/proceedings/lrec2014/pdf/1168_Paper.pdf
Cite (ACL):
Martin Gleize and Brigitte Grau. 2014. A hierarchical taxonomy for classifying hardness of inference tasks. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), pages 3034–3040, Reykjavik, Iceland. European Language Resources Association (ELRA).
Cite (Informal):
A hierarchical taxonomy for classifying hardness of inference tasks (Gleize & Grau, LREC 2014)
PDF:
http://www.lrec-conf.org/proceedings/lrec2014/pdf/1168_Paper.pdf