Towards a Task-Agnostic Model of Difficulty Estimation for Supervised Learning Tasks
Antonio Laverghetta Jr., Jamshidbek Mirzakhalov, John Licato
Abstract
Curriculum learning, a training strategy where training data are ordered based on their difficulty, has been shown to improve performance and reduce training time on various NLP tasks. While much work over the years has developed novel approaches for generating curricula, these strategies are typically only suited for the task they were designed for. This work explores developing a task-agnostic model for problem difficulty and applying it to the Stanford Natural Language Inference (SNLI) dataset. Using the human responses that come with the dev set of SNLI, we train both regression and classification models to predict how many annotators will answer a question correctly and then project the difficulty estimates onto the full SNLI train set to create the curriculum. We argue that our curriculum is effectively capturing difficulty for this task through various analyses of both the model and the predicted difficulty scores.
- Anthology ID:
- 2020.aacl-srw.3
- Volume:
- Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing: Student Research Workshop
- Month:
- December
- Year:
- 2020
- Address:
- Suzhou, China
- Editors:
- Boaz Shmueli, Yin Jou Huang
- Venue:
- AACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 16–23
- URL:
- https://aclanthology.org/2020.aacl-srw.3
- Cite (ACL):
- Antonio Laverghetta Jr., Jamshidbek Mirzakhalov, and John Licato. 2020. Towards a Task-Agnostic Model of Difficulty Estimation for Supervised Learning Tasks. In Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing: Student Research Workshop, pages 16–23, Suzhou, China. Association for Computational Linguistics.
- Cite (Informal):
- Towards a Task-Agnostic Model of Difficulty Estimation for Supervised Learning Tasks (Laverghetta Jr. et al., AACL 2020)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-4/2020.aacl-srw.3.pdf
- Code:
- amhrlab/supervised-cl
- Data:
- SNLI
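
As a rough illustration of the pipeline outlined in the abstract, the Python sketch below shows two of its steps: deriving a per-item target from annotator responses (how many annotators chose the gold label) and ordering unlabeled training items from easy to hard using a model's predicted count. This is not the authors' implementation, which lives in the repository listed above. The function names, the default of five annotators per item, and the dictionary-based stand-in for a trained regression model are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def agreement_count(annotator_labels: Sequence[str], gold_label: str) -> int:
    """Count how many annotators chose the gold label for one item."""
    return sum(1 for label in annotator_labels if label == gold_label)


def build_curriculum(
    examples: Sequence[T],
    predict_count: Callable[[T], float],
    num_annotators: int = 5,  # assumption: five human labels per item
) -> List[Tuple[T, float]]:
    """Order examples from easiest to hardest.

    `predict_count` is a hypothetical stand-in for a trained model that
    estimates how many annotators would answer an item correctly.
    Difficulty is defined here as 1 - predicted_count / num_annotators,
    which is one possible scoring choice, not necessarily the paper's.
    """
    scored = []
    for example in examples:
        # Clamp the prediction to the valid range [0, num_annotators].
        predicted = min(max(predict_count(example), 0.0), float(num_annotators))
        difficulty = 1.0 - predicted / num_annotators
        scored.append((example, difficulty))
    # Python's sort is stable, so ties keep their original order.
    scored.sort(key=lambda pair: pair[1])
    return scored


if __name__ == "__main__":
    # Toy annotator responses for a single dev-set item.
    labels = ["entailment", "entailment", "neutral", "entailment", "contradiction"]
    print(agreement_count(labels, "entailment"))

    # Toy predicted counts for three hypothetical training items.
    toy_predictions = {"item-a": 2.0, "item-b": 5.0, "item-c": 3.5}
    for example, difficulty in build_curriculum(list(toy_predictions), toy_predictions.get):
        print(example, round(difficulty, 2))
```

In a real setting, `predict_count` would wrap a regression or classification model fitted on the dev-set targets produced by `agreement_count`, and the sorted list would determine the order in which training batches are presented.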