Using J-K-fold Cross Validation To Reduce Variance When Tuning NLP Models

Henry Moss, David Leslie, Paul Rayson


Abstract
K-fold cross validation (CV) is a popular method for estimating the true performance of machine learning models, allowing model selection and parameter tuning. However, the very process of CV requires random partitioning of the data and so our performance estimates are in fact stochastic, with variability that can be substantial for natural language processing tasks. We demonstrate that these unstable estimates cannot be relied upon for effective parameter tuning. The resulting tuned parameters are highly sensitive to how our data is partitioned, meaning that we often select sub-optimal parameter choices and have serious reproducibility issues. Instead, we propose to use the less variable J-K-fold CV, in which J independent K-fold cross validations are used to assess performance. Our main contributions are extending J-K-fold CV from performance estimation to parameter tuning and investigating how to choose J and K. We argue that variability is more important than bias for effective tuning and so advocate lower choices of K than are typically seen in the NLP literature and instead use the saved computation to increase J. To demonstrate the generality of our recommendations we investigate a wide range of case-studies: sentiment classification (both general and target-specific), part-of-speech tagging and document classification.
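The J-K-fold procedure described in the abstract can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the function names, the `evaluate(train_idx, test_idx)` callback and the `make_evaluate` factory are all hypothetical, and reusing the same seed (hence the same partitions) for every candidate parameter is an assumption of this sketch rather than something the abstract specifies.

```python
import random
from statistics import mean


def k_fold_partition(n_samples, k, rng):
    """Randomly split indices 0..n_samples-1 into k near-equal folds."""
    indices = list(range(n_samples))
    rng.shuffle(indices)
    return [indices[i::k] for i in range(k)]


def jk_fold_estimate(evaluate, n_samples, j, k, seed=0):
    """Average a score over J independent random K-fold partitions.

    `evaluate(train_idx, test_idx)` is a hypothetical user-supplied callback
    that fits a model on train_idx and returns its score on test_idx.
    """
    rng = random.Random(seed)
    scores = []
    for _ in range(j):
        # Each repetition draws a fresh random partition of the data.
        folds = k_fold_partition(n_samples, k, rng)
        for held_out in range(k):
            test_idx = folds[held_out]
            train_idx = [i for f, fold in enumerate(folds)
                         if f != held_out for i in fold]
            scores.append(evaluate(train_idx, test_idx))
    # J * K fold scores in total; their mean is the performance estimate.
    return mean(scores)


def tune(candidates, make_evaluate, n_samples, j, k, seed=0):
    """Return the candidate with the highest J-K-fold estimate.

    `make_evaluate(candidate)` is a hypothetical factory returning the
    `evaluate` callback for that parameter value. Assumption of this sketch:
    every candidate is scored on the same partitions (same seed).
    """
    estimates = {c: jk_fold_estimate(make_evaluate(c), n_samples, j, k, seed)
                 for c in candidates}
    best = max(estimates, key=estimates.get)
    return best, estimates
```

Under a fixed budget of J × K model fits, the abstract's recommendation amounts to calling `jk_fold_estimate` with a smaller `k` and a larger `j` than a single 10-fold run would use.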
Anthology ID:
C18-1252
Volume:
Proceedings of the 27th International Conference on Computational Linguistics
Month:
August
Year:
2018
Address:
Santa Fe, New Mexico, USA
Editors:
Emily M. Bender, Leon Derczynski, Pierre Isabelle
Venue:
COLING
Publisher:
Association for Computational Linguistics
Pages:
2978–2989
URL:
https://aclanthology.org/C18-1252
Cite (ACL):
Henry Moss, David Leslie, and Paul Rayson. 2018. Using J-K-fold Cross Validation To Reduce Variance When Tuning NLP Models. In Proceedings of the 27th International Conference on Computational Linguistics, pages 2978–2989, Santa Fe, New Mexico, USA. Association for Computational Linguistics.
Cite (Informal):
Using J-K-fold Cross Validation To Reduce Variance When Tuning NLP Models (Moss et al., COLING 2018)
PDF:
https://preview.aclanthology.org/naacl24-info/C18-1252.pdf
Code:
henrymoss/COLING2018
Data:
IMDb Movie Reviews