Haihua Xie


2020

基于数据增强和多任务特征学习的中文语法错误检测方法(Chinese Grammar Error Detection based on Data Enhancement and Multi-task Feature Learning)
Haihua Xie (谢海华) | Zhiyou Chen (陈志优) | Jing Cheng (程静) | Xiaoqing Lyu (吕肖庆) | Zhi Tang (汤帜)
Proceedings of the 19th Chinese National Conference on Computational Linguistics

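Chinese grammatical error detection (CGED) is difficult because of the complexity of Chinese grammar, and the scarcity of training corpora and related research means that CGED performance still falls far short of practical use. This paper proposes a CGED model that compensates for the scarcity of training data through data augmentation, a pre-trained language model, and multi-task learning based on linguistic features. Data augmentation effectively expands the training set; the pre-trained language model carries rich semantic information that helps grammatical analysis; and fine-tuning the language model with multi-task learning based on linguistic features lets it learn linguistic features relevant to grammatical error detection. The proposed method is evaluated on the NLPTEA CGED dataset and achieves better results than other models.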

Combining Impression Feature Representation for Multi-turn Conversational Question Answering
Shaoling Jing | Shibo Hong | Dongyan Zhao | Haihua Xie | Zhi Tang
Proceedings of the 19th Chinese National Conference on Computational Linguistics

Multi-turn conversational question answering (ConvQA) is a practical task that requires understanding the conversation history, such as previous QA pairs, as well as the passage context and the current question. It can be applied to a variety of human-machine dialogue scenarios. The major challenge of this task is that the model must consider the relevant conversation history while understanding the passage. Existing methods usually either simply prepend the history to the current question or use complicated mechanisms to model the history. This article proposes an impression feature, which uses a word-level inter-attention mechanism to learn multi-oriented information from the conversation history to the input sequence. This includes attention from history tokens to each token of the input sequence, history-turn inter-attention from different history turns to each token of the input sequence, and self-attention within the input sequence, where the input sequence contains the current question and a passage. A feature selection method is then designed to enhance the useful history turns of the conversation and weaken unnecessary information. Finally, we demonstrate the effectiveness of the proposed method on the QuAC dataset, analyze the impact of different feature selection methods, and verify the validity and reliability of the proposed features through visualization and human evaluation.

A Novel Joint Framework for Multiple Chinese Events Extraction
Nuo Xu | Haihua Xie | Dongyan Zhao
Proceedings of the 19th Chinese National Conference on Computational Linguistics

Event extraction is an essential yet challenging task in information extraction. Previous approaches have paid little attention to the problem of role overlap, which is a common phenomenon in practice. To solve this problem, this paper defines event relation triples to explicitly represent relations among triggers, arguments, and roles, which are incorporated into the model to learn their inter-dependencies. The task of argument extraction is thereby converted into event relation triple extraction. A novel joint framework for multiple Chinese event extraction is proposed, which jointly predicts event triggers and arguments based on shared feature representations from a pre-trained language model. Experimental comparison with state-of-the-art baselines on the ACE 2005 dataset shows the superiority of the proposed method in both trigger classification and argument classification.