基于深加工语料库的《唐诗三百首》难度分级 (The difficulty classification of ‘Three Hundred Tang Poems’ based on the deep processing corpus)

Yuyu Huang (黄宇宇), Xinyu Chen (陈欣雨), Minxuan Feng (冯敏萱), Yunuo Wang (王禹诺), Beiyuan Wang (王蓓原), Bin Li (李斌)


Abstract
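To support the selection of Tang poems for primary and secondary school textbooks and readers, this paper builds on a deeply processed corpus of Three Hundred Tang Poems (《唐诗三百首》) annotated with word segmentation, part-of-speech tags, and allusion markers, and constructs a new grading standard based on the readability of the poems' lines. The standard has 4 layers and 8 quantifiable indicators in total: the character layer (interchangeable characters, 通假字), the word layer (two-character words), the sentence layer (special sentence patterns, title length, line length), and the artistic layer (allusions, other rhetorical devices, descriptive techniques). The 313 poems in the corpus are scored on these 8 indicators, a vector space model is built from the quantified features, and the K-means clustering algorithm groups the poems into clusters corresponding to Tang poetry study at 3 school stages: primary school, junior high school, and senior high school.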
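The pipeline the abstract outlines (score each poem on 8 indicators, treat the scores as a vector, cluster with K-means into 3 groups) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The indicator order, the min-max scaling step, the deterministic centroid initialization, the rule that a cluster with a larger centroid total is "harder", and the toy scores are all assumptions made only for illustration.

```python
# Sketch only: indicator order, scaling, initialization, and the
# "larger total score = harder" ordering are assumptions, not the paper's code.

# Hypothetical indicator order, following the eight indicators in the abstract.
INDICATORS = [
    "interchangeable_chars", "two_char_words", "special_patterns", "title_length",
    "line_length", "allusions", "other_rhetoric", "descriptive_techniques",
]
STAGES = ["primary", "junior high", "senior high"]

# Made-up scores for six imaginary poems (NOT data from the paper).
TOY_SCORES = [
    [0, 1, 0, 2, 20, 0, 0, 1],
    [0, 2, 0, 3, 20, 0, 1, 1],
    [1, 4, 1, 5, 40, 1, 1, 2],
    [1, 5, 1, 6, 40, 1, 2, 2],
    [3, 9, 3, 12, 120, 4, 3, 4],
    [3, 10, 3, 14, 110, 4, 4, 4],
]


def min_max_scale(rows):
    """Rescale each indicator column to [0, 1] so no single indicator dominates."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]  # constant column: divide by 1
    return [[(v - lo) / sp for v, lo, sp in zip(row, lows, spans)] for row in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def init_centroids(points, k):
    """Deterministic start: k points spread evenly along the total-score ranking."""
    ranked = sorted(points, key=sum)
    n = len(ranked)
    return [list(ranked[(2 * i + 1) * n // (2 * k)]) for i in range(k)]


def kmeans(points, k, max_iters=100):
    """Plain Lloyd-style K-means; returns (labels, centroids)."""
    centroids = init_centroids(points, k)
    labels = None
    for _ in range(max_iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def grade_poems(scores):
    """Cluster poems into three groups and name the groups by centroid total."""
    labels, centroids = kmeans(min_max_scale(scores), k=len(STAGES))
    order = sorted(range(len(centroids)), key=lambda j: sum(centroids[j]))
    stage_of = {cluster: STAGES[rank] for rank, cluster in enumerate(order)}
    return [stage_of[lab] for lab in labels]


if __name__ == "__main__":
    for scores, stage in zip(TOY_SCORES, grade_poems(TOY_SCORES)):
        print(scores, "->", stage)
```

K-means itself only returns unordered clusters, so some extra rule is needed to decide which cluster corresponds to which school stage. The centroid-total ordering used here is one simple possibility; the abstract does not say how the paper makes that assignment.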
Anthology ID:
2023.ccl-1.43
Volume:
Proceedings of the 22nd Chinese National Conference on Computational Linguistics
Month:
August
Year:
2023
Address:
Harbin, China
Editors:
Maosong Sun, Bing Qin, Xipeng Qiu, Jing Jiang, Xianpei Han
Venue:
CCL
Publisher:
Chinese Information Processing Society of China
Pages:
491–500
Language:
Chinese
URL:
https://aclanthology.org/2023.ccl-1.43
Cite (ACL):
Yuyu Huang, Xinyu Chen, Minxuan Feng, Yunuo Wang, Beiyuan Wang, and Bin Li. 2023. 基于深加工语料库的《唐诗三百首》难度分级 (The difficulty classification of ‘Three Hundred Tang Poems’ based on the deep processing corpus). In Proceedings of the 22nd Chinese National Conference on Computational Linguistics, pages 491–500, Harbin, China. Chinese Information Processing Society of China.
Cite (Informal):
基于深加工语料库的《唐诗三百首》难度分级 (The difficulty classification of ‘Three Hundred Tang Poems’ based on the deep processing corpus) (Huang et al., CCL 2023)
PDF:
https://preview.aclanthology.org/emnlp-22-attachments/2023.ccl-1.43.pdf