Abstract
Complex Word Identification (CWI) is the task of identifying words that are challenging for second-language learners to read. Although neural classifiers are now common in CWI, their parameters remain difficult to interpret. This paper analyzes neural CWI classifiers and shows that some of their parameters can be interpreted as vocabulary size. We present a novel formalization, as a kind of neural classifier, of the vocabulary size measurement methods practiced in applied linguistics. We also contribute to building, via crowdsourcing, a novel dataset for validating vocabulary testing and readability.
- Anthology ID:
- 2020.bea-1.17
- Volume:
- Proceedings of the Fifteenth Workshop on Innovative Use of NLP for Building Educational Applications
- Month:
- July
- Year:
- 2020
- Address:
- Seattle, WA, USA → Online
- Editors:
- Jill Burstein, Ekaterina Kochmar, Claudia Leacock, Nitin Madnani, Ildikó Pilán, Helen Yannakoudakis, Torsten Zesch
- Venue:
- BEA
- SIG:
- SIGEDU
- Publisher:
- Association for Computational Linguistics
- Pages:
- 171–176
- URL:
- https://aclanthology.org/2020.bea-1.17
- DOI:
- 10.18653/v1/2020.bea-1.17
- Cite (ACL):
- Yo Ehara. 2020. Interpreting Neural CWI Classifiers’ Weights as Vocabulary Size. In Proceedings of the Fifteenth Workshop on Innovative Use of NLP for Building Educational Applications, pages 171–176, Seattle, WA, USA → Online. Association for Computational Linguistics.
- Cite (Informal):
- Interpreting Neural CWI Classifiers’ Weights as Vocabulary Size (Ehara, BEA 2020)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-4/2020.bea-1.17.pdf