Cross-Lingual Leveled Reading Based on Language-Invariant Features

Simin Rao, Hua Zheng, Sujian Li


Abstract
Leveled reading (LR) aims to automatically classify texts by the cognitive levels of readers, which is fundamental to providing reading materials appropriate to different reading capabilities. However, most state-of-the-art LR methods rely on copious annotated resources, which prevents their adaptation to low-resource languages like Chinese. In our work, to tackle LR in Chinese, we explore how different language transfer methods perform on English-Chinese LR. Specifically, we focus on adversarial training and cross-lingual pre-training methods to transfer the LR knowledge learned from annotated data in the resource-rich English language to Chinese. For evaluation, we first introduce an age-based standard to align datasets with different leveling standards. Then we conduct experiments in both zero-shot and few-shot settings. Quantitative and qualitative comparisons of the two methods show that the cross-lingual pre-training method effectively captures the language-invariant features between English and Chinese. We also analyze the results to propose further improvements in cross-lingual LR.
Anthology ID:
2021.findings-emnlp.227
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2021
Month:
November
Year:
2021
Address:
Punta Cana, Dominican Republic
Editors:
Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih
Venue:
Findings
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
2677–2682
URL:
https://preview.aclanthology.org/icon-24-ingestion/2021.findings-emnlp.227/
DOI:
10.18653/v1/2021.findings-emnlp.227
Cite (ACL):
Simin Rao, Hua Zheng, and Sujian Li. 2021. Cross-Lingual Leveled Reading Based on Language-Invariant Features. In Findings of the Association for Computational Linguistics: EMNLP 2021, pages 2677–2682, Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal):
Cross-Lingual Leveled Reading Based on Language-Invariant Features (Rao et al., Findings 2021)
PDF:
https://preview.aclanthology.org/icon-24-ingestion/2021.findings-emnlp.227.pdf
Video:
https://preview.aclanthology.org/icon-24-ingestion/2021.findings-emnlp.227.mp4