Dawei Lu


2021

Learning Syntactic Dense Embedding with Correlation Graph for Automatic Readability Assessment
Xinying Qiu | Yuan Chen | Hanwu Chen | Jian-Yun Nie | Yuming Shen | Dawei Lu
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Deep learning models for automatic readability assessment generally discard the linguistic features traditionally used in machine learning models for the task. We propose to incorporate linguistic features into neural network models by learning syntactic dense embeddings based on those features. To capture the relationships between the features, we form a correlation graph among them and use it to learn their embeddings, so that similar features are represented by similar embeddings. Experiments with six data sets of two proficiency levels demonstrate that our proposed methodology can complement a BERT-only model to achieve significantly better performance for automatic readability assessment.

2020

A Study of Syntactic Constituent and Semantic Role of New Branch Topic (新支话题的句法成分和语义角色研究)
Dawei Lu (卢达威)
Proceedings of the 19th Chinese National Conference on Computational Linguistics

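Topic continuation and topic shift are important pragmatic functions in discourse. From the perspective of sharing the sentence-initial topic, this paper classifies topic continuation and shift into five types: sentence-initial topic continuation, sentence-internal subtopic continuation, complete topic shift, pivot topic shift, and new branch topic shift. It then examines a special case of topic shift, the new branch topic. Based on a Generalized Topic Structure corpus of 330,000 characters, the paper gathers statistics on and analyzes the syntactic constituents and semantic roles of new branch topics, and finds that the vast majority of constituents that can become new branch topics are nominal phrases referring to concrete entities. In terms of syntactic constituents, the subject of an object clause or complement clause, the minor subject of a sentence with a subject-predicate predicate, the subject of an adverbial-initial sentence, the sentence-final object, a non-final object in a serial verb construction, the pivot of a pivotal sentence, a prepositional object, and even an adverbial can all become new branch topics and thereby introduce a new branch clause. Sentence-final objects serve as new branch topics most often, while no case of an indirect object serving as a new branch topic was found. In terms of semantic roles, most subject arguments (agent, sentient, experiencer, theme) and object arguments (patient, relative, result, target, dative), along with a few means arguments (instrument, manner, material) and environment arguments (location, goal, path), can become new branch topics that introduce new branch clauses. Among these, relatives and patients become new branch topics most prominently, followed by agents, results, and targets. This study reveals one possible way in which syntax and semantics constrain the pragmatic phenomenon of topic shift. It should help both humans and computers understand the topic-shift mechanism of Chinese discourse more deeply, with the aim of gradually grounding this pragmatic phenomenon in semantic and ultimately syntactic form, and eventually enabling automatic computational analysis of topic shift.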