On the Need of Cross Validation for Discourse Relation Classification

Wei Shi, Vera Demberg


Abstract
The task of implicit discourse relation classification has received increased attention in recent years, including two CoNLL shared tasks on the topic. Existing machine learning models for the task train on sections 2-21 of the PDTB and test on section 23, which includes a total of 761 implicit discourse relations. In this paper, we make a methodological point: the standard test set is too small to determine whether the inclusion of certain features constitutes a genuine improvement or whether one simply got lucky with some properties of the test set. We therefore argue that the community should adopt cross validation for the discourse relation classification task.
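To give a rough sense of why 761 test instances may be too few, the following minimal sketch computes the half-width of a normal-approximation 95% confidence interval for a classifier's accuracy. The test-set size comes from the abstract; the accuracy value of 0.40 and the function name are assumptions chosen purely for illustration and are not results from the paper.

```python
import math


def accuracy_half_width(accuracy: float, n_items: int, z: float = 1.96) -> float:
    """Half-width of a normal-approximation confidence interval for accuracy.

    Treats each test item as an independent Bernoulli trial, which is a
    simplifying assumption for this illustration.
    """
    standard_error = math.sqrt(accuracy * (1.0 - accuracy) / n_items)
    return z * standard_error


N_TEST = 761             # implicit relations in PDTB section 23 (from the abstract)
ASSUMED_ACCURACY = 0.40  # hypothetical value, not a figure reported in the paper

half_width = accuracy_half_width(ASSUMED_ACCURACY, N_TEST)
print(f"95% interval: {ASSUMED_ACCURACY:.2f} +/- {half_width:.3f}")
```

Under these assumptions the interval spans a few percentage points on either side, so a small accuracy gain on this test set alone could plausibly be due to chance. Evaluating over several folds increases the number of evaluated instances and narrows the interval.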
Anthology ID:
E17-2024
Volume:
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers
Month:
April
Year:
2017
Address:
Valencia, Spain
Editors:
Mirella Lapata, Phil Blunsom, Alexander Koller
Venue:
EACL
Publisher:
Association for Computational Linguistics
Pages:
150–156
URL:
https://aclanthology.org/E17-2024
Cite (ACL):
Wei Shi and Vera Demberg. 2017. On the Need of Cross Validation for Discourse Relation Classification. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers, pages 150–156, Valencia, Spain. Association for Computational Linguistics.
Cite (Informal):
On the Need of Cross Validation for Discourse Relation Classification (Shi & Demberg, EACL 2017)
PDF:
https://preview.aclanthology.org/landing_page/E17-2024.pdf