Abstract
We describe a method for rapidly creating language proficiency assessments, and provide experimental evidence that such tests can be valid, reliable, and secure. Our approach is the first to use machine learning and natural language processing to induce proficiency scales based on a given standard, and then use linguistic models to estimate item difficulty directly for computer-adaptive testing. This alleviates the need for expensive pilot testing with human subjects. We used these methods to develop an online proficiency exam called the Duolingo English Test, and demonstrate that its scores align significantly with other high-stakes English assessments. Furthermore, our approach produces test scores that are highly reliable, while generating item banks large enough to satisfy security requirements.
- Anthology ID: 2020.tacl-1.17
- Volume: Transactions of the Association for Computational Linguistics, Volume 8
- Year: 2020
- Address: Cambridge, MA
- Editors: Mark Johnson, Brian Roark, Ani Nenkova
- Venue: TACL
- Publisher: MIT Press
- Pages: 247–263
- URL: https://aclanthology.org/2020.tacl-1.17
- DOI: 10.1162/tacl_a_00310
- Cite (ACL): Burr Settles, Geoffrey T. LaFlair, and Masato Hagiwara. 2020. Machine Learning–Driven Language Assessment. Transactions of the Association for Computational Linguistics, 8:247–263.
- Cite (Informal): Machine Learning–Driven Language Assessment (Settles et al., TACL 2020)
- PDF: https://aclanthology.org/2020.tacl-1.17.pdf
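
The abstract describes estimating item difficulty with linguistic models and feeding those estimates into computer-adaptive testing (CAT). As a rough illustration of how machine-predicted difficulties could plug into a standard CAT loop, here is a minimal sketch under the Rasch (1PL) model: administer the unused item whose difficulty is nearest the current ability estimate, then re-estimate ability from the responses so far. The function names, parameters, and the crude gradient-ascent ability update are all illustrative assumptions, not the paper's actual implementation, and the randomly sampled item bank stands in for the paper's model-derived difficulties.

```python
import math
import random


def rasch_prob(theta: float, b: float) -> float:
    """Probability of a correct response under the Rasch (1PL) model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def update_theta(theta, responses, difficulties, lr=0.5, steps=25):
    """Crude gradient-ascent MLE of ability from the responses seen so far."""
    for _ in range(steps):
        grad = sum(y - rasch_prob(theta, b)
                   for y, b in zip(responses, difficulties))
        theta += lr * grad / len(difficulties)
    return max(-4.0, min(4.0, theta))  # clamp to keep early estimates sane


def adaptive_test(item_bank, true_theta, n_items=15, seed=0):
    """Minimal CAT loop: repeatedly administer the unused item whose
    difficulty is closest to the current ability estimate (the maximum-
    information choice for the Rasch model), then re-estimate ability."""
    rng = random.Random(seed)
    theta, used = 0.0, set()
    responses, difficulties = [], []
    for _ in range(n_items):
        i = min((j for j in range(len(item_bank)) if j not in used),
                key=lambda j: abs(item_bank[j] - theta))
        used.add(i)
        b = item_bank[i]
        # simulate the test taker's (probabilistic) response
        y = 1 if rng.random() < rasch_prob(true_theta, b) else 0
        responses.append(y)
        difficulties.append(b)
        theta = update_theta(theta, responses, difficulties)
    return theta


if __name__ == "__main__":
    rng = random.Random(1)
    # stand-in for model-predicted difficulties; the paper derives these
    # from linguistic features rather than sampling them at random
    bank = [rng.gauss(0.0, 1.0) for _ in range(500)]
    print(f"estimated ability: {adaptive_test(bank, true_theta=0.8):+.2f}")
```

A large bank of this kind, with difficulties predicted by a model instead of calibrated through pilot testing, is what lets each test taker see a different item sequence, which is how the abstract's reliability and security claims connect to the adaptive design.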