Joint learning of frequency and word embeddings for multilingual readability assessment

Dieu-Thu Le, Cam-Tu Nguyen, Xiaoliang Wang


Abstract
This paper describes two models that employ word frequency embeddings to address readability assessment in multiple languages. The task is to determine the difficulty level of a given document, i.e., how hard it is for a reader to fully comprehend the text. The proposed models show how frequency information can be integrated to improve readability assessment. Experimental results on both English and Chinese datasets show that the proposed models notably outperform models that use only traditional word embeddings.
Anthology ID:
W18-3714
Volume:
Proceedings of the 5th Workshop on Natural Language Processing Techniques for Educational Applications
Month:
July
Year:
2018
Address:
Melbourne, Australia
Editors:
Yuen-Hsien Tseng, Hsin-Hsi Chen, Vincent Ng, Mamoru Komachi
Venue:
NLP-TEA
Publisher:
Association for Computational Linguistics
Pages:
103–107
URL:
https://aclanthology.org/W18-3714
DOI:
10.18653/v1/W18-3714
Cite (ACL):
Dieu-Thu Le, Cam-Tu Nguyen, and Xiaoliang Wang. 2018. Joint learning of frequency and word embeddings for multilingual readability assessment. In Proceedings of the 5th Workshop on Natural Language Processing Techniques for Educational Applications, pages 103–107, Melbourne, Australia. Association for Computational Linguistics.
Cite (Informal):
Joint learning of frequency and word embeddings for multilingual readability assessment (Le et al., NLP-TEA 2018)
PDF:
https://preview.aclanthology.org/nschneid-patch-4/W18-3714.pdf