Qin Chen (陈琴) - ACL Anthology

Qin Chen

2021

基于多质心异质图学习的社交网络用户建模(User Representation Learning based on Multi-centroid Heterogeneous Graph Neural Networks)
Shangyi Ning (宁上毅) | Guanying Li (李冠颖) | Qin Chen (陈琴) | Zengfeng Huang (黄增峰) | Baohua Zhou (周葆华) | Zhongyu Wei (魏忠钰)
Proceedings of the 20th Chinese National Conference on Computational Linguistics

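User modeling has attracted broad attention from both academia and industry. Most existing methods focus on incorporating interpersonal relationships among users into communities, while user-generated content (such as posts) has not been well studied. In this paper, through an analysis of real-world public opinion propagation, we show the important role that studying user attributes plays in the propagation process, and we propose a method for filtering user profile data. We also propose a modeling approach that captures more diverse community features for users through heterogeneous multi-centroid graph pooling. We first construct a heterogeneous graph consisting of users and keywords and apply a heterogeneous graph neural network to it. To facilitate graph representation for user modeling, we propose a multi-centroid graph pooling mechanism that incorporates multi-centroid cluster features into representation learning. Extensive experiments on three benchmark datasets demonstrate the effectiveness of the method.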

Attending via both Fine-tuning and Compressing
Jie Zhou | Yuanbin Wu | Qin Chen | Xuanjing Huang | Liang He
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

This paper presents our approach to Task 11, NLPContributionGraph, of SemEval-2021. The purpose of the task was to extract triples from a paper in the Natural Language Processing field for constructing an Open Research Knowledge Graph. The task includes three sub-tasks: detecting the contribution sentences in papers; identifying scientific terms and predicate phrases from the contribution sentences; and inferring triples in the form of (subject, predicate, object) as statements for Knowledge Graph building. In this paper, we apply an ensemble of various fine-tuned pre-trained language models (PLMs) for sub-tasks one and two. In addition, self-training methods are adopted to tackle the shortage of annotated data. For the third sub-task, rather than using classic neural open information extraction (OIE) architectures, we generate potential triples via manually designed rules and develop a binary classifier to differentiate positive ones from others. The quantitative results show that we rank 4th, 2nd, and 2nd in the three evaluation phases.

2020

Infusing Disease Knowledge into BERT for Health Question Answering, Medical Inference and Disease Name Recognition
Yun He | Ziwei Zhu | Yin Zhang | Qin Chen | James Caverlee
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Knowledge of a disease includes information on various aspects of the disease, such as signs and symptoms, diagnosis, and treatment. This disease knowledge is critical for many health-related and biomedical tasks, including consumer health question answering, medical language inference, and disease name recognition. While pre-trained language models like BERT have shown success in capturing syntactic, semantic, and world knowledge from text, we find they can be further complemented by specific information like knowledge of symptoms, diagnoses, treatments, and other disease aspects. Hence, we integrate BERT with disease knowledge to improve these important tasks. Specifically, we propose a new disease knowledge infusion training procedure and evaluate it on a suite of BERT models including BERT, BioBERT, SciBERT, ClinicalBERT, BlueBERT, and ALBERT. Experiments over the three tasks show that these models can be enhanced in nearly all cases, demonstrating the viability of disease knowledge infusion. For example, the accuracy of BioBERT on consumer health question answering improves from 68.29% to 72.09%, and new SOTA results are observed on two datasets. We make our data and code freely available.

Terms contained in Gene Ontology (GO) have been widely used in biology and bio-medicine. Most previous research focuses on inferring new GO terms, while the term names that reflect the gene function are still named by the experts. To fill this gap, we propose a novel task, namely term name generation for GO, and build a large-scale benchmark dataset. Furthermore, we present a graph-based generative model that incorporates the relations between genes, words and terms for term name generation, which exhibits great advantages over the strong baselines.

2019

Enhancing Dialogue Symptom Diagnosis with Global Attention and Symptom Graph
Xinzhu Lin | Xiahui He | Qin Chen | Huaixiao Tou | Zhongyu Wei | Ting Chen
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Symptom diagnosis is a challenging yet profound problem in natural language processing. Most previous research focuses on standard electronic medical records for symptom diagnosis, while the dialogues between doctors and patients, which contain richer information, are not well studied. In this paper, we first construct a dialogue symptom diagnosis dataset based on an online medical forum with a large number of dialogues between patients and doctors. Then, we provide some benchmark models on this dataset to advance research on dialogue symptom diagnosis. To further enhance the performance of symptom diagnosis over dialogues, we propose a global attention mechanism to capture more symptom-related information, and build a symptom graph to model the associations between symptoms rather than treating each symptom independently. Experimental results show that both the global attention and the symptom graph are effective in improving dialogue symptom diagnosis. In particular, our proposed model achieves state-of-the-art performance on the constructed dataset.