基于《同义词词林》的中文语体分类资源构建(Construction of Chinese register classification resources based on “Tongyici Cilin”)
Guojing Huang (黄国敬), Liwei Zhou (周立炜), Gaoqi Rao (饶高琦), Jiaojiao Zang (臧娇娇)
Abstract
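Register words are words used exclusively in a particular register; they are the linguistic elements and formal markers of that register. Register-word resources can serve NLP applications that are closely tied to real-world scenarios, but such resources are currently scarce. To address this, based on 《大词林》, this paper completes three tasks, "register word annotation", "register (word) chain annotation", and "parallel construction annotation", and builds a register classification resource founded on register words. The resource contains 55,710 words, 5,017 register chains, and 433 groups of parallel constructions. On this basis, the paper analyzes the overall distribution of Chinese register words, their morphological differences, and the distribution of their word senses and parts of speech.
- Anthology ID: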
- 2022.ccl-1.39
- Volume:
- Proceedings of the 21st Chinese National Conference on Computational Linguistics
- Month:
- October
- Year:
- 2022
- Address:
- Nanchang, China
- Venue:
- CCL
- Publisher:
- Chinese Information Processing Society of China
- Pages:
- 431–443
- Language:
- Chinese
- URL:
- https://aclanthology.org/2022.ccl-1.39
- Cite (ACL):
- Guojing Huang, Liwei Zhou, Gaoqi Rao, and Jiaojiao Zang. 2022. 基于《同义词词林》的中文语体分类资源构建(Construction of Chinese register classification resources based on “Tongyici Cilin”). In Proceedings of the 21st Chinese National Conference on Computational Linguistics, pages 431–443, Nanchang, China. Chinese Information Processing Society of China.
- Cite (Informal):
- 基于《同义词词林》的中文语体分类资源构建(Construction of Chinese register classification resources based on “Tongyici Cilin”) (Huang et al., CCL 2022)
- PDF:
- https://preview.aclanthology.org/remove-xml-comments/2022.ccl-1.39.pdf