A Neural Pairwise Ranking Model for Readability Assessment

Justin Lee, Sowmya Vajjala


Abstract
Automatic Readability Assessment (ARA), the task of assigning a reading level to a text, is traditionally treated as a classification problem in NLP research. In this paper, we propose the first neural, pairwise ranking approach to ARA and compare it with existing classification, regression, and (non-neural) ranking methods. We establish the performance of our approach by conducting experiments on three English datasets, one French dataset, and one Spanish dataset. We demonstrate that our approach performs well in monolingual single- and cross-corpus testing scenarios and achieves a zero-shot cross-lingual ranking accuracy of over 80% for both French and Spanish when trained on English data. We also release a new parallel bilingual readability dataset that could be useful for future research. To our knowledge, this paper proposes the first neural pairwise ranking model for ARA, and shows the first results of cross-lingual, zero-shot evaluation of ARA with neural models.
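To illustrate the pairwise ranking formulation the abstract describes (without reproducing the paper's actual neural architecture), the sketch below shows the two core ingredients in plain Python: a margin-based pairwise loss that rewards scoring the harder text above the easier one, and a pairwise ranking accuracy metric over texts with distinct reading levels. The function names and the toy scores are illustrative assumptions, not the authors' code.

```python
# Hypothetical sketch of pairwise ranking for readability assessment.
# In the paper's setting, a neural scorer would map a text to a scalar
# difficulty score; training pushes score(harder) above score(easier).

def margin_ranking_loss(score_hard, score_easy, margin=1.0):
    """Pairwise hinge loss: zero once score_hard exceeds score_easy by >= margin."""
    return max(0.0, margin - (score_hard - score_easy))

def pairwise_accuracy(scores, levels):
    """Fraction of text pairs with distinct reading levels ordered correctly by score."""
    correct = total = 0
    for i in range(len(scores)):
        for j in range(i + 1, len(scores)):
            if levels[i] == levels[j]:
                continue  # ties in gold level are not ranked pairs
            total += 1
            # Correct if score difference and level difference share a sign.
            if (scores[i] - scores[j]) * (levels[i] - levels[j]) > 0:
                correct += 1
    return correct / total

# Toy example: model scores mostly follow gold reading levels 1..4,
# except the last two texts are swapped.
scores = [0.2, 0.9, 1.4, 1.1]
levels = [1, 2, 3, 4]
print(pairwise_accuracy(scores, levels))  # 5 of 6 ordered pairs correct
```

The "ranking accuracy of over 80%" reported in the abstract corresponds to a metric of this pairwise kind: the model is judged on ordering pairs of texts correctly rather than on predicting exact level labels.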
Anthology ID:
2022.findings-acl.300
Volume:
Findings of the Association for Computational Linguistics: ACL 2022
Month:
May
Year:
2022
Address:
Dublin, Ireland
Editors:
Smaranda Muresan, Preslav Nakov, Aline Villavicencio
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
3802–3813
URL:
https://aclanthology.org/2022.findings-acl.300
DOI:
10.18653/v1/2022.findings-acl.300
Cite (ACL):
Justin Lee and Sowmya Vajjala. 2022. A Neural Pairwise Ranking Model for Readability Assessment. In Findings of the Association for Computational Linguistics: ACL 2022, pages 3802–3813, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
A Neural Pairwise Ranking Model for Readability Assessment (Lee & Vajjala, Findings 2022)
PDF:
https://preview.aclanthology.org/dois-2013-emnlp/2022.findings-acl.300.pdf
Software:
 2022.findings-acl.300.software.zip
Code
 jlee118/nprm
Data
Newsela