Abstract
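This paper proposes an improved method for constructing readability corpora and uses it to build a larger corpus of Chinese sentence readability. On the task of assessing absolute sentence difficulty, the corpus achieves an accuracy of 0.7869, an improvement of more than 0.15 over previous work, which demonstrates the effectiveness of the improved method. The paper applies deep learning methods to Chinese readability assessment, examines how well different deep learning methods automatically capture difficulty features, and further investigates how incorporating linguistic difficulty features at different levels into the deep learning features affects overall model performance. Experimental results show that deep learning models differ in their ability to capture difficulty features, and that linguistic difficulty features improve the difficulty representation ability of deep learning models to varying degrees.
- Anthology ID: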
- 2020.ccl-1.68
- Volume:
- Proceedings of the 19th Chinese National Conference on Computational Linguistics
- Month:
- October
- Year:
- 2020
- Address:
- Haikou, China
- Editors:
- Maosong Sun (孙茂松), Sujian Li (李素建), Yue Zhang (张岳), Yang Liu (刘洋)
- Venue:
- CCL
- Publisher:
- Chinese Information Processing Society of China
- Pages:
- 731–742
- Language:
- Chinese
- URL:
- https://aclanthology.org/2020.ccl-1.68
- Cite (ACL):
- Yuling Tang and Dong Yu. 2020. 结合深度学习和语言难度特征的句子可读性计算方法(The method of calculating sentence readability combined with deep learning and language difficulty characteristics). In Proceedings of the 19th Chinese National Conference on Computational Linguistics, pages 731–742, Haikou, China. Chinese Information Processing Society of China.
- Cite (Informal):
- 结合深度学习和语言难度特征的句子可读性计算方法(The method of calculating sentence readability combined with deep learning and language difficulty characteristics) (Tang & Yu, CCL 2020)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-4/2020.ccl-1.68.pdf
- Data
- OneStopEnglish