Abstract
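This study builds a chunk dependency treebank based on a predicate-centered chunk-based dependency grammar. It looks for the chunks governed by each predicate both within and across sentences, uses the chunks of Chinese and the dependency relations between them to fill in omitted elements, and makes the government relations of each predicate explicit. To date, 2,199 texts have been annotated, covering two domains (encyclopedia and news) and totaling about 1.87 million characters. The paper outlines the principles of chunk-based dependency grammar, defines chunks and their dependency relations, and describes in detail the annotation procedure, the annotation agreement rate, and the data distribution. Based on the current treebank, the study finds that about 25% of Chinese clauses are not self-sufficient, and that about 88% of core predicates govern one to three dependents.
- Anthology ID: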
- 2020.ccl-1.53
- Volume:
- Proceedings of the 19th Chinese National Conference on Computational Linguistics
- Month:
- October
- Year:
- 2020
- Address:
- Haikou, China
- Venue:
- CCL
- Publisher:
- Chinese Information Processing Society of China
- Pages:
- 572–580
- Language:
- Chinese
- URL:
- https://aclanthology.org/2020.ccl-1.53
- Cite (ACL):
- Qingqing Qian and Chengwen Wang. 2020. 汉语块依存语法与树库构建(Chinese Chunk-Based Dependency Grammar and Treebank construction). In Proceedings of the 19th Chinese National Conference on Computational Linguistics, pages 572–580, Haikou, China. Chinese Information Processing Society of China.
- Cite (Informal):
- 汉语块依存语法与树库构建(Chinese Chunk-Based Dependency Grammar and Treebank construction) (Qian & Wang, CCL 2020)
- PDF:
- https://preview.aclanthology.org/starsem-semeval-split/2020.ccl-1.53.pdf