Kang Liu


2023

Learning with Partial Annotations for Event Detection
Jian Liu | Dianbo Sui | Kang Liu | Haoyan Liu | Zhe Zhao
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Event detection (ED) seeks to discover and classify event instances in plain texts. Previous methods for ED typically adopt supervised learning, requiring fully labeled and high-quality training data. However, in a real-world application, we may not obtain clean training data but only partially labeled one, which could substantially impede the learning process. In this work, we conduct a seminal study for learning with partial annotations for ED. We propose a new trigger localization formulation using contrastive learning to distinguish ground-truth triggers from contexts, showing a decent robustness for addressing partial annotation noise. Impressively, in an extreme scenario where more than 90% of events are unlabeled, our approach achieves an F1 score of over 60%. In addition, we re-annotate and make available two fully annotated subsets of ACE 2005 to serve as an unbiased benchmark for event detection. We hope our approach and data will inspire future studies on this vital yet understudied problem.

ParaLS: Lexical Substitution via Pretrained Paraphraser
Jipeng Qiang | Kang Liu | Yun Li | Yunhao Yuan | Yi Zhu
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Lexical substitution (LS) aims at finding appropriate substitutes for a target word in a sentence. Recently, LS methods based on pretrained language models have made remarkable progress, generating potential substitutes for a target word through analysis of its contextual surroundings. However, these methods tend to overlook the preservation of the sentence’s meaning when generating the substitutes. This study explores how to generate the substitute candidates from a paraphraser, as the generated paraphrases from a paraphraser contain variations in word choice and preserve the sentence’s meaning. Since we cannot directly generate the substitutes via commonly used decoding strategies, we propose two simple decoding strategies that focus on the variations of the target word during decoding. Experimental results show that our methods outperform state-of-the-art LS methods based on pre-trained language models on three benchmarks.

S3HQA: A Three-Stage Approach for Multi-hop Text-Table Hybrid Question Answering
Fangyu Lei | Xiang Li | Yifan Wei | Shizhu He | Yiming Huang | Jun Zhao | Kang Liu
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Answering multi-hop questions over hybrid factual knowledge from the given text and table (TextTableQA) is a challenging task. Existing models mainly adopt a retriever-reader framework, which has several deficiencies, such as noisy labeling in training the retriever, insufficient utilization of heterogeneous information over text and table, and deficient ability for different reasoning operations. In this paper, we propose a three-stage TextTableQA framework, S3HQA, which comprises a retriever, a selector, and a reasoner. We use a retriever with refinement training to solve the noisy labeling problem. Then, a hybrid selector considers the linked relationships between heterogeneous data to select the most relevant factual knowledge. For the final stage, instead of adapting a reading comprehension module as in previous methods, we employ a generation-based reasoner to obtain answers. This includes two approaches: a row-wise generator and an LLM prompting generator (used for the first time in this task). The experimental results demonstrate that our method achieves competitive results in the few-shot setting. When trained on the full dataset, our approach outperforms all baseline methods, ranking first on the HybridQA leaderboard.

Find Parent then Label Children: A Two-stage Taxonomy Completion Method with Pre-trained Language Model
Fei Xia | Yixuan Weng | Shizhu He | Kang Liu | Jun Zhao
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics

Taxonomies, which organize domain concepts into hierarchical structures, are crucial for building knowledge systems and downstream applications. As domain knowledge evolves, taxonomies need to be continuously updated to include new concepts. Previous approaches have mainly focused on adding concepts to the leaf nodes of the existing hierarchical tree, which does not fully utilize the taxonomy’s knowledge and is unable to update the original taxonomy structure (usually involving non-leaf nodes). In this paper, we propose a two-stage method called ATTEMPT for taxonomy completion. Our method inserts new concepts into the correct position by finding a parent node and labeling child nodes. Specifically, by combining local nodes with prompts to generate natural sentences, we take advantage of pre-trained language models for hypernym/hyponymy recognition. Experimental results on two public datasets (including six domains) show that ATTEMPT performs best on both taxonomy completion and extension tasks, surpassing existing methods.

Class Lifelong Learning for Intent Detection via Structure Consolidation Networks
Qingbin Liu | Yanchao Hao | Xiaolong Liu | Bo Li | Dianbo Sui | Shizhu He | Kang Liu | Jun Zhao | Xi Chen | Ningyu Zhang | Jiaoyan Chen
Findings of the Association for Computational Linguistics: ACL 2023

Intent detection, which estimates diverse intents behind user utterances, is an essential component of task-oriented dialogue systems. Previous intent detection models are usually trained offline, which can only handle predefined intent classes. In the real world, new intents may keep challenging deployed models. For example, with the prevalence of the COVID-19 pandemic, users may pose various issues related to the pandemic to conversational systems, which brings many new intents. A general intent detection model should be intelligent enough to continually learn new data and recognize new arriving intent classes. Therefore, this work explores Class Lifelong Learning for Intent Detection (CLL-ID), where the model continually learns new intent classes from new data while avoiding catastrophic performance degradation on old data. To this end, we propose a novel lifelong learning method, called Structure Consolidation Networks (SCN), which consists of structure-based retrospection and contrastive knowledge distillation to handle the problems of expression diversity and class imbalance in the CLL-ID task. In addition to formulating the new task, we construct 3 benchmarks based on 8 intent detection datasets. Experimental results demonstrate the effectiveness of SCN, which significantly outperforms previous lifelong learning methods on the three benchmarks.

Prediction and Calibration: Complex Reasoning over Knowledge Graph with Bi-directional Directed Acyclic Graph Neural Network
Yao Xu | Shizhu He | Li Cai | Kang Liu | Jun Zhao
Findings of the Association for Computational Linguistics: ACL 2023

Answering complex logical queries is a challenging task for knowledge graph (KG) reasoning. Recently, query embedding (QE) has been proposed to encode queries and entities into the same vector space, and obtain answers based on numerical computation. However, such models obtain the node representations of a query based only on its predecessor nodes, ignoring the information contained in successor nodes. In this paper, we propose a Bi-directional Directed Acyclic Graph neural network (BiDAG) that splits the reasoning process into prediction and calibration. The joint probability of all nodes is considered by applying a graph neural network (GNN) to the query graph in the calibration process. With prediction in the first layer and calibration in the deeper layers of the GNN, BiDAG can outperform previous QE-based methods on FB15k, FB15k-237, and NELL995.

Interpreting Sentiment Composition with Latent Semantic Tree
Zhongtao Jiang | Yuanzhe Zhang | Cao Liu | Jiansong Chen | Jun Zhao | Kang Liu
Findings of the Association for Computational Linguistics: ACL 2023

As the key to sentiment analysis, sentiment composition considers the classification of a constituent via classifications of its contained sub-constituents and rules operated on them. Such compositionality has been widely studied previously in the form of hierarchical trees including untagged and sentiment ones, which are intrinsically suboptimal in our view. To address this, we propose the semantic tree, a new tree form capable of interpreting the sentiment composition in a principled way. A semantic tree is a derivation of a context-free grammar (CFG) describing the specific composition rules on different semantic roles, which is designed carefully following previous linguistic conclusions. However, the semantic tree is a latent variable since it is not annotated in regular datasets. Thus, in our method, it is marginalized out via the inside algorithm and learned to optimize the classification performance. Quantitative and qualitative results demonstrate that our method not only achieves better or competitive results compared to baselines in the setting of regular and domain adaptation classification, but also generates plausible tree explanations.
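As an illustration of what marginalizing a latent tree with the inside algorithm means, the following is a minimal sketch for a generic probabilistic CFG in Chomsky normal form. It is not the grammar or model from the paper: the function name `inside_probability`, the rule dictionaries, and the toy grammar are all assumptions made for this example.

```python
from collections import defaultdict


def inside_probability(words, lexical, binary, root="S"):
    """Sum the probabilities of all parse trees of `words` under a toy PCFG.

    lexical: {(A, word): prob} for rules A -> word   (hypothetical format)
    binary:  {(A, B, C): prob} for rules A -> B C    (hypothetical format)
    """
    n = len(words)
    # chart[(i, j, A)] = total probability that A derives words[i:j]
    chart = defaultdict(float)

    # Width-1 spans come from lexical rules.
    for i, word in enumerate(words):
        for (a, w), p in lexical.items():
            if w == word:
                chart[(i, i + 1, a)] += p

    # Wider spans sum over every split point and every binary rule.
    for width in range(2, n + 1):
        for i in range(0, n - width + 1):
            j = i + width
            for split in range(i + 1, j):
                for (a, b, c), p in binary.items():
                    left = chart.get((i, split, b), 0.0)
                    right = chart.get((split, j, c), 0.0)
                    if left and right:
                        chart[(i, j, a)] += p * left * right

    return chart.get((0, n, root), 0.0)


# Toy ambiguous grammar: S -> S S (0.5) | x (0.5)
toy_lexical = {("S", "x"): 0.5}
toy_binary = {("S", "S", "S"): 0.5}
total = inside_probability(["x", "x", "x"], toy_lexical, toy_binary)
```

In this toy grammar the three-word input has two bracketings, and the chart adds their probabilities instead of picking one, which is the sense in which the tree is summed out.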

Multilingual Knowledge Graph Completion from Pretrained Language Models with Knowledge Constraints
Ran Song | Shizhu He | Shengxiang Gao | Li Cai | Kang Liu | Zhengtao Yu | Jun Zhao
Findings of the Association for Computational Linguistics: ACL 2023

Multilingual Knowledge Graph Completion (mKGC) aims at solving queries in different languages by reasoning about a tail entity, thus improving multilingual knowledge graphs. Previous studies leverage multilingual pretrained language models (PLMs) and the generative paradigm to achieve mKGC. Although multilingual pretrained language models contain extensive knowledge of different languages, their pretraining tasks cannot be directly aligned with the mKGC tasks. Moreover, the majority of KGs and PLMs currently available exhibit a pronounced English-centric bias. This makes it difficult for mKGC to achieve good results, particularly in the context of low-resource languages. To overcome these problems, this paper introduces global and local knowledge constraints for mKGC. The former is used to constrain the reasoning of answer entities, while the latter is used to enhance the representation of query contexts. The proposed method makes the pretrained model better adapt to the mKGC task. Experimental results on public datasets demonstrate that our method outperforms the previous SOTA on Hits@1 and Hits@10 by an average of 12.32% and 16.03%, respectively, which indicates that our proposed method significantly enhances mKGC.
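For readers unfamiliar with the Hits@k metric reported above, a minimal sketch of how it is commonly computed follows. The function name `hits_at_k` and the toy entity lists are assumptions for illustration and are not taken from the paper or its evaluation code.

```python
from typing import Sequence


def hits_at_k(ranked: Sequence[Sequence[str]], gold: Sequence[str], k: int) -> float:
    """Fraction of queries whose gold entity appears in the top-k ranked candidates."""
    if len(ranked) != len(gold):
        raise ValueError("ranked and gold must have the same number of queries")
    if not gold:
        return 0.0
    hits = sum(1 for candidates, answer in zip(ranked, gold) if answer in candidates[:k])
    return hits / len(gold)


# Hypothetical rankings for three tail-entity queries (best candidate first).
toy_ranked = [["Paris", "Lyon"], ["Berlin", "Munich"], ["Rome", "Milan"]]
toy_gold = ["Paris", "Munich", "Naples"]

hits1 = hits_at_k(toy_ranked, toy_gold, k=1)
hits2 = hits_at_k(toy_ranked, toy_gold, k=2)
```

Hits@1 only credits a query when the gold entity is ranked first, while Hits@10 credits any query whose gold entity is among the first ten candidates, so the latter is never lower than the former.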

EventOA: An Event Ontology Alignment Benchmark Based on FrameNet and Wikidata
Shaoru Guo | Chenhao Wang | Yubo Chen | Kang Liu | Ru Li | Jun Zhao
Findings of the Association for Computational Linguistics: ACL 2023

Event ontology provides a shared and formal specification about what happens in the real world and can benefit many natural language understanding tasks. However, the independent development of event ontologies often results in heterogeneous representations that raise the need for establishing alignments between semantically related events. There exists a series of works about ontology alignment (OA), but they only focus on the entity-based OA, and neglect the event-based OA. To fill the gap, we construct an Event Ontology Alignment (EventOA) dataset based on FrameNet and Wikidata, which consists of 900+ event type alignments and 8,000+ event argument alignments. Furthermore, we propose a multi-view event ontology alignment (MEOA) method, which utilizes description information (i.e., name, alias and definition) and neighbor information (i.e., subclass and superclass) to obtain richer representation of the event ontologies. Extensive experiments show that our MEOA outperforms the existing entity-based OA methods and can serve as a strong baseline for EventOA research.

A Hierarchical Explanation Generation Method Based on Feature Interaction Detection
Yiming Ju | Yuanzhe Zhang | Kang Liu | Jun Zhao
Findings of the Association for Computational Linguistics: ACL 2023

The opaqueness of deep NLP models has motivated efforts to explain how deep models predict. Recently, work has introduced hierarchical attribution explanations, which calculate attribution scores for compositional text hierarchically to capture compositional semantics. Existing work on hierarchical attributions tends to limit the text groups to a continuous text span, which we call the connecting rule. While easy for humans to read, limiting the attribution unit to a continuous span might lose important long-distance feature interactions for reflecting model predictions. In this work, we introduce a novel strategy for capturing feature interactions and employ it to build hierarchical explanations without the connecting rule. The proposed method can convert ubiquitous non-hierarchical explanations (e.g., LIME) into their corresponding hierarchical versions. Experimental results show the effectiveness of our approach in building high-quality hierarchical explanations.

2022

Answering Numerical Reasoning Questions in Table-Text Hybrid Contents with Graph-based Encoder and Tree-based Decoder
Fangyu Lei | Shizhu He | Xiang Li | Jun Zhao | Kang Liu
Proceedings of the 29th International Conference on Computational Linguistics

CMQA: A Dataset of Conditional Question Answering with Multiple-Span Answers
Yiming Ju | Weikang Wang | Yuanzhe Zhang | Suncong Zheng | Kang Liu | Jun Zhao
Proceedings of the 29th International Conference on Computational Linguistics

Forcing the answer of the Question Answering (QA) task to be a single text span might be restrictive since the answer can be multiple spans in the context. Moreover, we found that multi-span answers often appear with two characteristics when building the QA system for a real-world application. First, multi-span answers might be caused by users lacking domain knowledge and asking ambiguous questions, which makes the question need to be answered with conditions. Second, there might be hierarchical relations among multiple answer spans. Some recent span-extraction QA datasets include multi-span samples, but they only contain unconditional and parallel answers, which cannot be used to tackle this problem. To bridge the gap, we propose a new task: conditional question answering with hierarchical multi-span answers, where both the hierarchical relations and the conditions need to be extracted. Correspondingly, we introduce CMQA, a Conditional Multiple-span Chinese Question Answering dataset to study the new proposed task. The final release of CMQA consists of 7,861 QA pairs and 113,089 labels, where all samples contain multi-span answers, 50.4% of samples are conditional, and 56.6% of samples are hierarchical. CMQA can serve as a benchmark to study the new proposed task and help study building QA systems for real-world applications. The low performance of models drawn from related literature shows that the new proposed task is challenging for the community to solve.

Augmentation, Retrieval, Generation: Event Sequence Prediction with a Three-Stage Sequence-to-Sequence Approach
Bo Zhou | Chenhao Wang | Yubo Chen | Kang Liu | Jun Zhao | Jiexin Xu | Xiaojian Jiang | Qiuxia Li
Proceedings of the 29th International Conference on Computational Linguistics

Being able to infer possible events related to a specific target is critical to natural language processing. One challenging task in this line is event sequence prediction, which aims at predicting a sequence of events given a goal. The existing approach models this task as a statistical induction problem, predicting a sequence of events by exploring the similarity between the given goal and the known sequences of events. However, this statistics-based approach is complex and predicts a limited variety of events. At the same time, this approach ignores the rich knowledge of external events that is important for predicting event sequences. In this paper, in order to predict more diverse events, we first reformulate the event sequence prediction problem as a sequence generation problem. Then, to leverage external event knowledge, we propose a three-stage model including augmentation, retrieval and generation. Experimental results on the event sequence prediction dataset show that our model outperforms existing methods, demonstrating the effectiveness of the proposed model.

Generating Temporally-ordered Event Sequences via Event Optimal Transport
Bo Zhou | Yubo Chen | Kang Liu | Jun Zhao | Jiexin Xu | Xiaojian Jiang | Qiuxia Li
Proceedings of the 29th International Conference on Computational Linguistics

Generating temporally-ordered event sequences in texts is important to natural language processing. Two emerging tasks in this direction are temporal event ordering (rearranging the set of events to correct order) and event infilling (generating an event at a specified position). To tackle the two related tasks, the existing method adopts a vanilla sequence-to-sequence model via maximum likelihood estimation (MLE). However, applying this approach to these tasks will cause two issues. One issue is that the MLE loss emphasizes strict local alignment and ignores the global semantics of the event. The other issue is that the model adopts a word-level objective to model events in texts, failing to evaluate the predicted results of the model from the perspective of event sequence. To alleviate these issues, we present a novel model to tackle the generation of temporally-ordered event sequences via Event Optimal Transport (EOT). First, we treat the events in the sequence as modeling units and explicitly extract the semantics of the events. Second, to provide event sequence-level evaluation of the predicted results of the model, we directly match events in sequences. Extensive experimental results show that our approach outperforms previous models on all evaluation datasets. In particular, the accuracy is improved by 7.7%, and the Macro F1 is improved by 7.2% on one of the datasets.

Decoupling Mixture-of-Graphs: Unseen Relational Learning for Knowledge Graph Completion by Fusing Ontology and Textual Experts
Ran Song | Shizhu He | Suncong Zheng | Shengxiang Gao | Kang Liu | Zhengtao Yu | Jun Zhao
Proceedings of the 29th International Conference on Computational Linguistics

Knowledge Graph Embedding (KGE) has been proposed and successfully applied to Knowledge Graph Completion (KGC). However, the classic KGE paradigm often fails to represent unseen relations. Previous studies mainly utilize the textual descriptions of relations and their neighbor relations to represent unseen relations. In fact, the semantics of a relation can be expressed by three kinds of graphs: the factual graph, the ontology graph, and the textual description graph, and they can complement each other. A more common scenario in the real world is that seen and unseen relations appear at the same time. In this setting, the training set (only seen relations) and testing set (both seen and unseen relations) have different distributions. This train-test inconsistency makes KGE methods easily overfit on seen relations and underperform on unseen relations. In this paper, we propose decoupling mixture-of-graph experts (DMoG) for unseen relation learning, which could represent the unseen relations in the factual graph by fusing ontology and textual graphs, and decouple the fusing space and the reasoning space to alleviate overfitting for seen relations. The experiments on two unseen-only public datasets and a mixture dataset verify the effectiveness of the proposed method, which improves the state-of-the-art methods by 6.84% in Hits@10 on average.

Document-Level Relation Extraction via Pair-Aware and Entity-Enhanced Representation Learning
Xiusheng Huang | Hang Yang | Yubo Chen | Jun Zhao | Kang Liu | Weijian Sun | Zuyu Zhao
Proceedings of the 29th International Conference on Computational Linguistics

Document-level relation extraction aims to recognize relations among multiple entity pairs from an entire document. Recent methods achieve considerable performance but still suffer from two challenges: a) the relational entity pairs are sparse, b) the representation of entity pairs is insufficient. In this paper, we propose the Pair-Aware and Entity-Enhanced (PAEE) model to solve the aforementioned two challenges. For the first challenge, we design a Pair-Aware Representation module to predict potential relational entity pairs, which constrains the relation extraction to the predicted entity pairs subset rather than all pairs. For the second, we introduce an Entity-Enhanced Representation module to assemble directional entity pairs and obtain a holistic understanding of the entire document. Experimental results show that our approach can obtain state-of-the-art performance on four benchmark datasets: DocRED, DWIE, CDR and GDA.

LingJing at SemEval-2022 Task 3: Applying DeBERTa to Lexical-level Presupposed Relation Taxonomy with Knowledge Transfer
Fei Xia | Bin Li | Yixuan Weng | Shizhu He | Bin Sun | Shutao Li | Kang Liu | Jun Zhao
Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)

This paper presents the results and main findings of our system on SemEval-2022 Task 3 Presupposed Taxonomies: Evaluating Neural Network Semantics (PreTENS). This task aims at semantic competence with specific attention on the evaluation of language models, which is a task with respect to the recognition of appropriate taxonomic relations between two nominal arguments. Two sub-tasks including binary classification and regression are designed for the evaluation. For the classification sub-task, we adopt the DeBERTa-v3 pre-trained model for fine-tuning datasets of different languages. Due to the small size of the training datasets of the regression sub-task, we transfer the knowledge of classification model (i.e., model parameters) to the regression task. The experimental results show that the proposed method achieves the best results on both sub-tasks. Meanwhile, we also report negative results of multiple training strategies for further discussion. All the experimental codes are open-sourced at https://github.com/WENGSYX/Semeval.

CASIA at SemEval-2022 Task 11: Chinese Named Entity Recognition for Complex and Ambiguous Entities
Jia Fu | Zhen Gan | Zhucong Li | Sirui Li | Dianbo Sui | Yubo Chen | Kang Liu | Jun Zhao
Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)

This paper describes our approach to developing a complex named entity recognition system for SemEval 2022 Task 11: MultiCoNER Multilingual Complex Named Entity Recognition, Track 9 - Chinese. In this task, we need to identify the entity boundaries and category labels for the six identified categories of CW, LOC, PER, GRP, CORP, and PROD. The task focuses on detecting semantically ambiguous and complex entities in short and low-context settings. We constructed a hybrid system based on the RoBERTa-large model with three training mechanisms and a series of data augmentation techniques. The three training mechanisms include adversarial training, Child-Tuning training, and continued pre-training. The core idea of the hybrid system is to improve the performance of the model in complex environments by introducing more domain knowledge through data augmentation and continuing pre-training domain adaptation of the model. Our proposed method in this paper achieves a macro-F1 of 0.797 on the final test set, ranking second.

Incremental Intent Detection for Medical Domain with Contrast Replay Networks
Guirong Bai | Shizhu He | Kang Liu | Jun Zhao
Findings of the Association for Computational Linguistics: ACL 2022

Conventional approaches to medical intent detection require fixed pre-defined intent categories. However, due to the incessant emergence of new medical intents in the real world, such requirement is not practical. Considering that it is computationally expensive to store and re-train the whole data every time new data and intents come in, we propose to incrementally learn emerged intents while avoiding catastrophically forgetting old intents. We first formulate incremental learning for medical intent detection. Then, we employ a memory-based method to handle incremental learning. We further propose to enhance the method with contrast replay networks, which use multilevel distillation and contrast objective to address training data imbalance and medical rare words respectively. Experiments show that the proposed method outperforms the state-of-the-art model by 5.7% and 9.1% of accuracy on two benchmarks respectively.

CASIA@SMM4H’22: A Uniform Health Information Mining System for Multilingual Social Media Texts
Jia Fu | Sirui Li | Hui Ming Yuan | Zhucong Li | Zhen Gan | Yubo Chen | Kang Liu | Jun Zhao | Shengping Liu
Proceedings of The Seventh Workshop on Social Media Mining for Health Applications, Workshop & Shared Task

This paper presents a description of our system in SMM4H-2022, where we participated in task 1a, task 4, and task 6 to task 10. There are three main challenges in SMM4H-2022, namely the domain shift problem, the prediction bias due to category imbalance, and the noise in informal text. In this paper, we propose a unified framework for the classification and named entity recognition tasks to solve the challenges, and it can be applied to both English and Spanish scenarios. The results of our system are higher than the median F1-scores for 7 tasks and significantly exceed the F1-scores for 6 tasks. The experimental results demonstrate the effectiveness of our system.

Logic Traps in Evaluating Attribution Scores
Yiming Ju | Yuanzhe Zhang | Zhao Yang | Zhongtao Jiang | Kang Liu | Jun Zhao
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Modern deep learning models are notoriously opaque, which has motivated the development of methods for interpreting how deep models predict. This goal is usually approached with attribution methods, which assess the influence of features on model predictions. As an explanation method, the evaluation criterion for an attribution method is how accurately it reflects the actual reasoning process of the model (faithfulness). Meanwhile, since the reasoning process of deep models is inaccessible, researchers design various evaluation methods to demonstrate their arguments. However, some crucial logic traps in these evaluation methods are ignored in most works, causing inaccurate evaluation and unfair comparison. This paper systematically reviews existing methods for evaluating attribution scores and summarizes the logic traps in these methods. We further conduct experiments to demonstrate the existence of each logic trap. Through both theoretical and experimental analysis, we hope to increase attention on the inaccurate evaluation of attribution scores. Moreover, with this paper, we suggest stopping focusing on improving performance under unreliable evaluation systems and starting efforts on reducing the impact of proposed logic traps.

Leveraging Explicit Lexico-logical Alignments in Text-to-SQL Parsing
Runxin Sun | Shizhu He | Chong Zhu | Yaohan He | Jinlong Li | Jun Zhao | Kang Liu
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Text-to-SQL aims to parse natural language questions into SQL queries, which is valuable in providing an easy interface to access large databases. Previous work has observed that leveraging lexico-logical alignments is very helpful to improve parsing performance. However, current attention-based approaches can only model such alignments at the token level and have unsatisfactory generalization capability. In this paper, we propose a new approach to leveraging explicit lexico-logical alignments. It first identifies possible phrase-level alignments and injects them as additional contexts to guide the parsing procedure. Experimental results on Squall show that our approach can make better use of such alignments and obtains an absolute improvement of 3.4% compared with the current state-of-the-art.

A Good Neighbor, A Found Treasure: Mining Treasured Neighbors for Knowledge Graph Entity Typing
Zhuoran Jin | Pengfei Cao | Yubo Chen | Kang Liu | Jun Zhao
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

The task of knowledge graph entity typing (KGET) aims to infer the missing types for entities in knowledge graphs. Some pioneering work has proved that neighbor information is very important for the task. However, existing methods only leverage the one-hop neighbor information of the central entity, ignoring the multi-hop neighbor information that can provide valuable clues for inference. Besides, we also observe that there are co-occurrence relations between types, which is very helpful to alleviate false-negative problem. In this paper, we propose a novel method called Mining Treasured Neighbors (MiNer) to make use of these two characteristics. Firstly, we devise a Neighbor Information Aggregation module to aggregate the neighbor information. Then, we propose an Entity Type Inference module to mitigate the adverse impact of the irrelevant neighbor information. Finally, a Type Co-occurrence Regularization module is designed to prevent the model from overfitting the false negative examples caused by missing types. Experimental results on two widely used datasets indicate that our approach significantly outperforms previous state-of-the-art methods.

CN-AutoMIC: Distilling Chinese Commonsense Knowledge from Pretrained Language Models
Chenhao Wang | Jiachun Li | Yubo Chen | Kang Liu | Jun Zhao
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Commonsense knowledge graphs (CKGs) are increasingly applied in various natural language processing tasks. However, most existing CKGs are limited to English, which hinders related research in non-English languages. Meanwhile, directly generating commonsense knowledge from pretrained language models has recently received attention, yet it has not been explored in non-English languages. In this paper, we propose a large-scale Chinese CKG generated from multilingual PLMs, named CN-AutoMIC, aiming to fill the research gap of non-English CKGs. To improve the efficiency, we propose a generate-by-category strategy to reduce invalid generation. To ensure the filtering quality, we develop cascaded filters to discard low-quality results. To further increase the diversity and density, we introduce a bootstrapping iteration process to reuse generated results. Finally, we conduct detailed analyses on CN-AutoMIC from different aspects. Empirical results show the proposed CKG has high quality and diversity, surpassing the direct translation version of similar English CKGs. We also find some interesting deficiency patterns and differences between relations, which reveal pending problems in commonsense knowledge generation. We share the resources and related models for further study.

CogKTR: A Knowledge-Enhanced Text Representation Toolkit for Natural Language Understanding
Zhuoran Jin | Tianyi Men | Hongbang Yuan | Yuyang Zhou | Pengfei Cao | Yubo Chen | Zhipeng Xue | Kang Liu | Jun Zhao
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: System Demonstrations

As the first step of modern natural language processing, text representation encodes discrete texts as continuous embeddings. Pre-trained language models (PLMs) have demonstrated strong ability in text representation and significantly promoted the development of natural language understanding (NLU). However, existing PLMs represent a text solely by its context, which is not enough to support knowledge-intensive NLU tasks. Knowledge is power, and fusing external knowledge explicitly into PLMs can provide knowledgeable text representations. Because previous knowledge-enhanced methods differ in many aspects, it is difficult to reproduce previous methods, implement new methods, and transfer between different methods. It is highly desirable to have a unified paradigm to encompass all kinds of methods in one framework. In this paper, we propose CogKTR, a knowledge-enhanced text representation toolkit for natural language understanding. According to our proposed Unified Knowledge-Enhanced Paradigm (UniKEP), CogKTR consists of four key stages, including knowledge acquisition, knowledge representation, knowledge injection, and knowledge application. CogKTR currently supports easy-to-use knowledge acquisition interfaces, multi-source knowledge embeddings, diverse knowledge-enhanced models, and various knowledge-intensive NLU tasks. Our unified, knowledgeable and modular toolkit is publicly available at GitHub, with an online system and a short instruction video.

MedConQA: Medical Conversational Question Answering System based on Knowledge Graphs
Fei Xia | Bin Li | Yixuan Weng | Shizhu He | Kang Liu | Bin Sun | Shutao Li | Jun Zhao
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: System Demonstrations

Medical conversational systems can relieve doctors’ burden and improve healthcare efficiency, especially during the COVID-19 pandemic. However, existing medical dialogue systems suffer from weak scalability, insufficient knowledge, and poor controllability. Thus, we propose a medical conversational question-answering (CQA) system based on the knowledge graph, namely MedConQA, which is designed as a pipeline framework to maintain high flexibility. Our system utilizes automated medical procedures, including medical triage, consultation, image-text drug recommendation, and record. Each module has been open-sourced as a tool, which can be used alone or in combination, with robust scalability. Besides, to conduct knowledge-grounded dialogues with users, we first construct a Chinese Medical Knowledge Graph (CMKG) and collect a large-scale Chinese Medical CQA (CMCQA) dataset, and we design a series of methods for more intelligent reasoning. Finally, we use several state-of-the-art (SOTA) techniques to make the final generated responses more controllable, which is further assured by hospital and professional evaluations. We have open-sourced related code, datasets, web pages, and tools, hoping to advance future research.

Knowledge Transfer with Visual Prompt in multi-modal Dialogue Understanding and Generation
Minjun Zhu | Yixuan Weng | Bin Li | Shizhu He | Kang Liu | Jun Zhao
Proceedings of the First Workshop On Transcript Understanding

The Visual Dialogue (VD) task has recently received increasing attention in AI research. Visual Dialogue aims to generate multi-round, interactive responses based on the dialogue history and image content. Existing textual dialogue models cannot fully understand visual information, resulting in a lack of scene features when communicating with humans continuously. Therefore, how to efficiently fuse multimodal data features remains a challenge. In this work, we propose a knowledge transfer method with visual prompt (VPTG) fusing multi-modal data, which is a flexible module that can utilize the text-only seq2seq model to handle visual dialogue tasks. The VPTG conducts text-image co-learning and multi-modal information fusion with visual prompts and visual knowledge distillation. Specifically, we construct visual prompts from visual representations and then induce sequence-to-sequence (seq2seq) models to fuse visual information and textual contexts by visual-text patterns. We also realize visual knowledge transfer through distillation between two different models’ text representations, so that the seq2seq model can actively learn visual semantic representations. Extensive experiments on the multi-modal dialogue understanding and generation (MDUG) datasets show the proposed VPTG outperforms other single-modal methods, which demonstrates the effectiveness of visual prompt and visual knowledge transfer.

2021

Probing into the Root: A Dataset for Reason Extraction of Structural Events from Financial Documents
Pei Chen | Kang Liu | Yubo Chen | Taifeng Wang | Jun Zhao
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

This paper proposes a new task regarding event reason extraction from document-level texts. Unlike the previous causality detection task, we do not assign target events in the text, but only provide structural event descriptions, a setting that accords better with practical scenarios. Moreover, we annotate a large dataset, FinReason, for evaluation, which provides Reasons annotation for Financial events in company announcements. This task is challenging because cases of multiple events, multiple reasons, and implicit reasons are included. In total, FinReason contains 8,794 documents, 12,861 financial events and 11,006 reason spans. We also provide the performance of existing canonical methods in event extraction and machine reading comprehension on this task. The results show a 7 percentage point F1 score gap between the best model and human performance, and existing methods are far from resolving this problem.

Improving Event Causality Identification via Self-Supervised Representation Learning on External Causal Statement
Xinyu Zuo | Pengfei Cao | Yubo Chen | Kang Liu | Jun Zhao | Weihua Peng | Yuguang Chen
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

Distantly Supervised Relation Extraction in Federated Settings
Dianbo Sui | Yubo Chen | Kang Liu | Jun Zhao
Findings of the Association for Computational Linguistics: EMNLP 2021

In relation extraction, distant supervision is widely used to automatically label a large-scale training dataset by aligning a knowledge base with unstructured text. Most existing studies in this field have assumed there is a great deal of centralized unstructured text. However, in practice, texts are usually distributed on different platforms and cannot be centralized due to privacy restrictions. Therefore, it is worthwhile to investigate distant supervision in the federated learning paradigm, which decouples the training of the model from the need for direct access to raw texts. However, overcoming label noise of distant supervision becomes more difficult in federated settings, because texts containing the same entity pair are scattered across different platforms. In this paper, we propose a federated denoising framework to suppress label noise in federated settings. The key to this framework is a multiple instance learning based denoising method that is able to select reliable sentences via cross-platform collaboration. Various experiments on the New York Times dataset and the miRNA gene regulation relation dataset demonstrate the effectiveness of the proposed method.

Domain-Lifelong Learning for Dialogue State Tracking via Knowledge Preservation Networks
Qingbin Liu | Pengfei Cao | Cao Liu | Jiansong Chen | Xunliang Cai | Fan Yang | Shizhu He | Kang Liu | Jun Zhao
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Dialogue state tracking (DST), which estimates user goals given a dialogue context, is an essential component of task-oriented dialogue systems. Conventional DST models are usually trained offline, which requires a fixed dataset prepared in advance. This paradigm is often impractical in real-world applications since online dialogue systems usually involve continually emerging new data and domains. Therefore, this paper explores Domain-Lifelong Learning for Dialogue State Tracking (DLL-DST), which aims to continually train a DST model on new data to learn incessantly emerging new domains while avoiding catastrophically forgetting old learned domains. To this end, we propose a novel domain-lifelong learning method, called Knowledge Preservation Networks (KPN), which consists of multi-prototype enhanced retrospection and multi-strategy knowledge distillation, to solve the problems of expression diversity and combinatorial explosion in the DLL-DST task. Experimental results show that KPN effectively alleviates catastrophic forgetting and outperforms previous state-of-the-art lifelong learning methods by 4.25% and 8.27% of whole joint goal accuracy on the MultiWOZ benchmark and the SGD benchmark, respectively.

Uncertain Local-to-Global Networks for Document-Level Event Factuality Identification
Pengfei Cao | Yubo Chen | Yuqing Yang | Kang Liu | Jun Zhao
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Event factuality indicates the degree of certainty about whether an event occurs in the real world. Existing studies mainly focus on identifying event factuality at sentence level, which easily leads to conflicts between different mentions of the same event. To this end, we study the problem of document-level event factuality identification, which determines the event factuality from the view of a document. For this task, we need to consider two important characteristics: Local Uncertainty and Global Structure, which can be utilized to improve performance. In this paper, we propose an Uncertain Local-to-Global Network (ULGN) to make use of these two characteristics. Specifically, we devise a Local Uncertainty Estimation module to model the uncertainty of local information. Moreover, we propose an Uncertain Information Aggregation module to leverage the global structure for integrating the local information. Experimental results demonstrate the effectiveness of our proposed method, outperforming the previous state-of-the-art model by 8.4% and 11.45% of F1 score on two widely used datasets.

Biomedical Concept Normalization by Leveraging Hypernyms
Cheng Yan | Yuanzhe Zhang | Kang Liu | Jun Zhao | Yafei Shi | Shengping Liu
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Biomedical Concept Normalization (BCN) is widely used in biomedical text processing as a fundamental module. Owing to numerous surface variants of biomedical concepts, BCN remains challenging and unsolved. In this paper, we exploit biomedical concept hypernyms to facilitate BCN. We propose Biomedical Concept Normalizer with Hypernyms (BCNH), a novel framework that adopts list-wise training to make use of both hypernyms and synonyms, and also employs a norm constraint on the representation of hypernym-hyponym entity pairs. The experimental results show that BCNH outperforms the previous state-of-the-art model on the NCBI dataset.

Enhancing Multiple-choice Machine Reading Comprehension by Punishing Illogical Interpretations
Yiming Ju | Yuanzhe Zhang | Zhixing Tian | Kang Liu | Xiaohuan Cao | Wenting Zhao | Jinlong Li | Jun Zhao
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Machine Reading Comprehension (MRC), which requires a machine to answer questions given the relevant documents, is an important way to test machines’ ability to understand human language. Multiple-choice MRC is one of the most studied tasks in MRC due to the convenience of evaluation and the flexibility of answer format. Post-hoc interpretation aims to explain a trained model and reveal how the model arrives at the prediction. One of the most important interpretation forms is to attribute model decisions to input features. Based on post-hoc interpretation methods, we assess attributions of paragraphs in multiple-choice MRC and improve the model by punishing the illogical attributions. Our method can improve model performance without any external information or change to the model structure. Furthermore, we also analyze how and why such a self-training method works.

Set Generation Networks for End-to-End Knowledge Base Population
Dianbo Sui | Chenhao Wang | Yubo Chen | Kang Liu | Jun Zhao | Wei Bi
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

The task of knowledge base population (KBP) aims to discover facts about entities from texts and expand a knowledge base with these facts. Previous studies shape end-to-end KBP as a machine translation task, which requires converting unordered facts into a sequence according to a pre-specified order. However, the facts stated in a sentence are unordered in essence. In this paper, we formulate end-to-end KBP as a direct set generation problem, avoiding considering the order of multiple facts. To solve the set generation problem, we propose networks featured by transformers with non-autoregressive parallel decoding. Unlike previous approaches that use an autoregressive decoder to generate facts one by one, the proposed networks can directly output the final set of facts in one shot. Furthermore, to train the networks, we also design a set-based loss that forces unique predictions via bipartite matching. Compared with cross-entropy loss that highly penalizes small shifts in fact order, the proposed bipartite matching loss is invariant to any permutation of predictions. Benefiting from getting rid of the burden of predicting the order of multiple facts, our proposed networks achieve state-of-the-art (SoTA) performance on two benchmark datasets.
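The permutation invariance claimed for the bipartite matching loss can be illustrated with a toy sketch. This is not the paper's implementation: the function names, the relation labels, and the probabilities below are hypothetical, the per-pair cost is assumed to be a plain negative log-likelihood, and the optimal assignment is found by brute force over permutations (only practical for very small sets).

```python
from itertools import permutations
import math


def pair_cost(pred_probs, gold_label):
    """Negative log-likelihood of one gold fact under one predicted distribution."""
    return -math.log(pred_probs[gold_label])


def bipartite_matching_loss(predictions, gold_facts):
    """Minimum total cost over all one-to-one assignments of predictions to gold facts.

    Brute force over permutations; an assumption made for readability, not efficiency.
    """
    assert len(predictions) == len(gold_facts)
    best = float("inf")
    for perm in permutations(range(len(predictions))):
        cost = sum(pair_cost(predictions[p], gold_facts[g]) for g, p in enumerate(perm))
        best = min(best, cost)
    return best


def ordered_cross_entropy(predictions, gold_facts):
    """Position-wise cross-entropy: prediction i is always scored against gold fact i."""
    return sum(pair_cost(p, g) for p, g in zip(predictions, gold_facts))


# Hypothetical toy data: two predicted distributions over two relation labels.
preds = [
    {"born_in": 0.9, "works_for": 0.1},
    {"born_in": 0.2, "works_for": 0.8},
]
gold = ["born_in", "works_for"]
swapped = preds[::-1]  # same predictions, emitted in the opposite order

print(bipartite_matching_loss(preds, gold), bipartite_matching_loss(swapped, gold))
print(ordered_cross_entropy(preds, gold), ordered_cross_entropy(swapped, gold))
```

In this toy setting the matching loss is the same for both orderings, while the position-wise cross-entropy grows sharply once the two predictions are swapped.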

CroAno : A Crowd Annotation Platform for Improving Label Consistency of Chinese NER Dataset
Baoli Zhang | Zhucong Li | Zhen Gan | Yubo Chen | Jing Wan | Kang Liu | Jun Zhao | Shengping Liu | Yafei Shi
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing: System Demonstrations

In this paper, we introduce CroAno, a web-based crowd annotation platform for Chinese named entity recognition (NER). Besides some basic features for crowd annotation like fast tagging and data management, CroAno provides a systematic solution for improving label consistency of Chinese NER datasets. 1) Disagreement Adjudicator: CroAno uses a multi-dimensional highlight mode to visualize instance-level inconsistent entities and makes the revision process user-friendly. 2) Inconsistency Detector: CroAno employs a detector to locate corpus-level label inconsistency and provides users with an interface to correct inconsistent entities in batches. 3) Prediction Error Analyzer: We deconstruct the entity prediction error of the model into six fine-grained entity error types. Users can employ this error system to detect corpus-level inconsistency from a model perspective. To validate the effectiveness of our platform, we use CroAno to revise two public datasets. In the two revised datasets, we get an improvement of +1.96% and +2.57% F1 respectively in model performance.

A Large-Scale Chinese Multimodal NER Dataset with Speech Clues
Dianbo Sui | Zhengkun Tian | Yubo Chen | Kang Liu | Jun Zhao
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

In this paper, we aim to explore an uncharted territory, which is Chinese multimodal named entity recognition (NER) with both textual and acoustic contents. To achieve this, we construct a large-scale human-annotated Chinese multimodal NER dataset, named CNERTA. In total, our corpus contains 42,987 annotated sentences accompanied by 71 hours of speech data. Based on this dataset, we propose a family of strong and representative baseline models, which can leverage textual features or multimodal features. Upon these baselines, to capture the natural monotonic alignment between the textual modality and the acoustic modality, we further propose a simple multimodal multitask model by introducing a speech-to-text alignment auxiliary task. Through extensive experiments, we observe that: (1) Performance improves progressively as we move from unimodal to multimodal, verifying the necessity of integrating speech clues into Chinese NER. (2) Our proposed model yields state-of-the-art (SoTA) results on CNERTA, demonstrating its effectiveness. For further research, the annotated dataset is publicly available at http://github.com/DianboWork/CNERTA.

LearnDA: Learnable Knowledge-Guided Data Augmentation for Event Causality Identification
Xinyu Zuo | Pengfei Cao | Yubo Chen | Kang Liu | Jun Zhao | Weihua Peng | Yuguang Chen
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Modern models for event causality identification (ECI) are mainly based on supervised learning, which makes them prone to the data lacking problem. Unfortunately, the existing NLP-related augmentation methods cannot directly produce available data required for this task. To solve the data lacking problem, we introduce a new approach to augment training data for event causality identification, by iteratively generating new examples and classifying event causality in a dual learning framework. On the one hand, our approach is knowledge guided, which can leverage existing knowledge bases to generate well-formed new sentences. On the other hand, our approach employs a dual mechanism, which is a learnable augmentation framework, and can interactively adjust the generation process to generate task-related sentences. Experimental results on two benchmarks, EventStoryLine and Causal-TimeBank, show that 1) our method can augment suitable task-related training data for ECI; 2) our method outperforms previous methods on EventStoryLine and Causal-TimeBank (+2.5 and +2.1 points on F1 value respectively).

Knowledge-Enriched Event Causality Identification via Latent Structure Induction Networks
Pengfei Cao | Xinyu Zuo | Yubo Chen | Kang Liu | Jun Zhao | Yuguang Chen | Weihua Peng
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Identifying causal relations of events is an important task in the natural language processing area. However, the task is very challenging, because event causality is usually expressed in diverse forms that often lack explicit causal clues. Existing methods cannot handle the problem well, especially when training data is lacking. Nonetheless, humans can make a correct judgement based on their background knowledge, including descriptive knowledge and relational knowledge. Inspired by this, we propose a novel Latent Structure Induction Network (LSIN) to incorporate the external structural knowledge into this task. Specifically, to make use of the descriptive knowledge, we devise a Descriptive Graph Induction module to obtain and encode the graph-structured descriptive knowledge. To leverage the relational knowledge, we propose a Relational Graph Induction module which is able to automatically learn a reasoning structure for event causality reasoning. Experimental results on two widely used datasets indicate that our approach significantly outperforms previous state-of-the-art methods.

Alignment Rationale for Natural Language Inference
Zhongtao Jiang | Yuanzhe Zhang | Zhao Yang | Jun Zhao | Kang Liu
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Deep learning models have achieved great success on the task of Natural Language Inference (NLI), though only a few attempts try to explain their behaviors. Existing explanation methods usually pick prominent features such as words or phrases from the input text. However, for NLI, alignments among words or phrases are more enlightening clues to explain the model. To this end, this paper presents AREC, a post-hoc approach to generate alignment rationale explanations for co-attention based models in NLI. The explanation is based on feature selection, which keeps few but sufficient alignments while maintaining the same prediction of the target model. Experimental results show that our method is more faithful and human-readable compared with many existing approaches. We further study and re-evaluate three typical models through our explanation beyond accuracy, and propose a simple method that greatly improves the model robustness.

Automatic ICD Coding via Interactive Shared Representation Networks with Self-distillation Mechanism
Tong Zhou | Pengfei Cao | Yubo Chen | Kang Liu | Jun Zhao | Kun Niu | Weifeng Chong | Shengping Liu
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

The ICD coding task aims at assigning codes of the International Classification of Diseases to clinical notes. Since manual coding is very laborious and prone to errors, many methods have been proposed for the automatic ICD coding task. However, existing works ignore either the long tail of code frequency or the noise in clinical notes. To address the above issues, we propose an Interactive Shared Representation Network with Self-Distillation Mechanism. Specifically, an interactive shared representation network targets building connections among codes while modeling the co-occurrence, consequently alleviating the long-tail problem. Moreover, to cope with the noisy text issue, we encourage the model to focus on the clinical note’s noteworthy part and extract valuable information through a self-distillation learning mechanism. Experimental results on two MIMIC datasets demonstrate the effectiveness of our method.

Document-level Event Extraction via Parallel Prediction Networks
Hang Yang | Dianbo Sui | Yubo Chen | Kang Liu | Jun Zhao | Taifeng Wang
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Document-level event extraction (DEE) is indispensable when events are described throughout a document. We argue that sentence-level extractors are ill-suited to the DEE task where event arguments always scatter across sentences and multiple events may co-exist in a document. It is a challenging task because it requires a holistic understanding of the document and an aggregated ability to assemble arguments across multiple sentences. In this paper, we propose an end-to-end model, which can extract structured events from a document in a parallel manner. Specifically, we first introduce a document-level encoder to obtain the document-aware representations. Then, a multi-granularity non-autoregressive decoder is used to generate events in parallel. Finally, to train the entire model, a matching loss function is proposed, which can bootstrap a global optimization. The empirical results on the widely used DEE dataset show that our approach significantly outperforms current state-of-the-art methods in the challenging DEE task. Code will be available at https://github.com/HangYang-NLP/DE-PPN.

Classification, Extraction, and Normalization : CASIA_Unisound Team at the Social Media Mining for Health 2021 Shared Tasks
Tong Zhou | Zhucong Li | Zhen Gan | Baoli Zhang | Yubo Chen | Kun Niu | Jing Wan | Kang Liu | Jun Zhao | Yafei Shi | Weifeng Chong | Shengping Liu
Proceedings of the Sixth Social Media Mining for Health (#SMM4H) Workshop and Shared Task

This is the system description of the CASIA_Unisound team for Task 1, Task 7b, and Task 8 of the sixth Social Media Mining for Health Applications (SMM4H) shared task in 2021. To deal with two challenges shared among those tasks, colloquial text and imbalanced annotation, we apply a customized pre-trained language model and propose various training strategies. Experimental results show the effectiveness of our system. Moreover, we got an F1-score of 0.87 in Task 8, which is the highest among all participants.

Proceedings of the 20th Chinese National Conference on Computational Linguistics
Sheng Li (李生) | Maosong Sun (孙茂松) | Yang Liu (刘洋) | Hua Wu (吴华) | Kang Liu (刘康) | Wanxiang Che (车万翔) | Shizhu He (何世柱) | Gaoqi Rao (饶高琦)
Proceedings of the 20th Chinese National Conference on Computational Linguistics

Knowledge Guided Metric Learning for Few-Shot Text Classification
Dianbo Sui | Yubo Chen | Binjie Mao | Delai Qiu | Kang Liu | Jun Zhao
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Humans can distinguish new categories very efficiently with few examples, largely due to the fact that human beings can leverage knowledge obtained from relevant tasks. However, deep learning based text classification models tend to struggle to achieve satisfactory performance when labeled data are scarce. Inspired by human intelligence, we propose to introduce external knowledge into few-shot learning to imitate human knowledge. A novel parameter generator network is investigated to this end, which is able to use the external knowledge to generate different metrics for different tasks. Armed with this network, similar tasks can use similar metrics while different tasks use different metrics. Through experiments, we demonstrate that our method outperforms the SoTA few-shot text classification models.

2020

Towards Causal Explanation Detection with Pyramid Salient-Aware Network
Xinyu Zuo | Yubo Chen | Kang Liu | Jun Zhao
Proceedings of the 19th Chinese National Conference on Computational Linguistics

Causal explanation analysis (CEA) can help us understand the reasons behind daily events, which has been found very helpful for understanding the coherence of messages. In this paper, we focus on Causal Explanation Detection, an important subtask of causal explanation analysis, which determines whether a causal explanation exists in one message. We design a Pyramid Salient-Aware Network (PSAN) to detect causal explanations on messages. PSAN can assist in causal explanation detection via capturing the salient semantics of discourses contained in their keywords with a bottom graph-based word-level salient network. Furthermore, PSAN can modify the dominance of discourses via a top attention-based discourse-level salient network to enhance explanatory semantics of messages. The experiments on the commonly used dataset of CEA show that PSAN outperforms the state-of-the-art method by 1.8% F1 value on the Causal Explanation Detection task.

Chinese Named Entity Recognition via Adaptive Multi-pass Memory Network with Hierarchical Tagging Mechanism
Pengfei Cao | Yubo Chen | Kang Liu | Jun Zhao
Proceedings of the 19th Chinese National Conference on Computational Linguistics

Named entity recognition (NER) aims to identify text spans that mention named entities and classify them into pre-defined categories. For the Chinese NER task, most of the existing methods are character-based sequence labeling models and achieve great success. However, these methods usually ignore lexical knowledge, which leads to false prediction of entity boundaries. Moreover, these methods have difficulties in capturing tag dependencies. In this paper, we propose an Adaptive Multi-pass Memory Network with Hierarchical Tagging Mechanism (AMMNHT) to address all of the above problems. Specifically, to reduce the errors of predicting entity boundaries, we propose an adaptive multi-pass memory network to exploit lexical knowledge. In addition, we propose a hierarchical tagging layer to learn tag dependencies. Experimental results on three widely used Chinese NER datasets demonstrate that our proposed model significantly outperforms other state-of-the-art methods.

Reconstructing Event Regions for Event Extraction via Graph Attention Networks
Pei Chen | Hang Yang | Kang Liu | Ruihong Huang | Yubo Chen | Taifeng Wang | Jun Zhao
Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing

Event information is usually scattered across multiple sentences within a document. The local sentence-level event extractors often yield many noisy event role filler extractions in the absence of a broader view of the document-level context. Filtering spurious extractions and aggregating event information in a document remains a challenging problem. Following the observation that a document has several relevant event regions densely populated with event role fillers, we build graphs with candidate role filler extractions enriched by sentential embeddings as nodes, and use graph attention networks to identify event regions in a document and aggregate event information. We characterize edges between candidate extractions in a graph into rich vector representations to facilitate event region identification. The experimental results on two datasets of two languages show that our approach yields new state-of-the-art performance for the challenging event extraction task.

Pre-trained Language Model Based Active Learning for Sentence Matching
Guirong Bai | Shizhu He | Kang Liu | Jun Zhao | Zaiqing Nie
Proceedings of the 28th International Conference on Computational Linguistics

Active learning is able to significantly reduce the annotation cost for data-driven techniques. However, previous active learning approaches for natural language processing mainly depend on the entropy-based uncertainty criterion, and ignore the characteristics of natural language. In this paper, we propose a pre-trained language model based active learning approach for sentence matching. Differing from previous active learning, it can provide linguistic criteria from the pre-trained language model to measure instances and help select more effective instances for annotation. Experiments demonstrate our approach can achieve greater accuracy with fewer labeled training instances.
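The entropy-based uncertainty criterion that the abstract attributes to earlier active learning approaches can be sketched as follows. This illustrates only that baseline criterion, not the language-model-based criteria proposed in the paper; the function names and the pool of class probabilities are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_most_uncertain(pool_probs, k):
    """Return the indices of the k unlabeled instances with the highest entropy."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical model outputs for four unlabeled sentence pairs (match / no match).
pool = [[0.98, 0.02], [0.55, 0.45], [0.70, 0.30], [0.50, 0.50]]
chosen = select_most_uncertain(pool, 2)
print(chosen)
```

Instances whose predicted distribution is closest to uniform are ranked first and would be sent for annotation under this criterion.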

KnowDis: Knowledge Enhanced Data Augmentation for Event Causality Detection via Distant Supervision
Xinyu Zuo | Yubo Chen | Kang Liu | Jun Zhao
Proceedings of the 28th International Conference on Computational Linguistics

Modern models of event causality detection (ECD) are mainly based on supervised learning from small hand-labeled corpora. However, hand-labeled training data is expensive to produce, has low coverage of causal expressions, and is limited in size, which makes it hard for supervised methods to detect causal relations between events. To solve this data lacking problem, we investigate a data augmentation framework for ECD, dubbed Knowledge Enhanced Distant Data Augmentation (KnowDis). Experimental results on two benchmark datasets, the EventStoryLine corpus and Causal-TimeBank, show that 1) KnowDis can augment available training data assisted with the lexical and causal commonsense knowledge for ECD via distant supervision, and 2) our method outperforms previous methods by a large margin assisted with automatically labeled training data.

Graph-Based Knowledge Integration for Question Answering over Dialogue
Jian Liu | Dianbo Sui | Kang Liu | Jun Zhao
Proceedings of the 28th International Conference on Computational Linguistics

Question answering over dialogue, a specialized machine reading comprehension task, aims to comprehend a dialogue and to answer specific questions. Despite many advances, existing approaches for this task do not consider dialogue structure and background knowledge (e.g., relationships between speakers). In this paper, we introduce a new approach for the task, featured by its novelty in structuring dialogue and integrating background knowledge for reasoning. Specifically, different from previous “structure-less” approaches, our method organizes a dialogue as a “relational graph”, using edges to represent relationships between entities. To encode this relational graph, we devise a relational graph convolutional network (R-GCN), which can traverse the graph’s topological structure and effectively encode multi-relational knowledge for reasoning. Extensive experiments have justified the effectiveness of our approach over competitive baselines. Moreover, a deeper analysis shows that our model is better at tackling complex questions requiring relational reasoning and defending against adversarial attacks with distracting sentences.

HyperCore: Hyperbolic and Co-graph Representation for Automatic ICD Coding
Pengfei Cao | Yubo Chen | Kang Liu | Jun Zhao | Shengping Liu | Weifeng Chong
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

The International Classification of Diseases (ICD) provides a standardized way of classifying diseases, which endows each disease with a unique code. ICD coding aims to assign proper ICD codes to a medical record. Since manual coding is very laborious and prone to errors, many methods have been proposed for the automatic ICD coding task. However, most existing methods independently predict each code, ignoring two important characteristics: Code Hierarchy and Code Co-occurrence. In this paper, we propose a Hyperbolic and Co-graph Representation method (HyperCore) to address the above problem. Specifically, we propose a hyperbolic representation method to leverage the code hierarchy. Moreover, we propose a graph convolutional network to utilize the code co-occurrence. Experimental results on two widely used datasets demonstrate that our proposed model outperforms previous state-of-the-art methods.

pdf
Connecting Embeddings for Knowledge Graph Entity Typing
Yu Zhao | Anxiang Zhang | Ruobing Xie | Kang Liu | Xiaojie Wang
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Knowledge graph (KG) entity typing aims at inferring possible missing entity type instances in KG, which is a very significant but still under-explored subtask of knowledge graph completion. In this paper, we propose a novel approach for KG entity typing which is trained by jointly utilizing local typing knowledge from existing entity type assertions and global triple knowledge in KGs. Specifically, we present two distinct knowledge-driven effective mechanisms of entity type inference. Accordingly, we build two novel embedding models to realize the mechanisms. Afterward, a joint model via connecting them is used to infer missing entity type instances, which favors inferences that agree with both entity type instances and triple knowledge in KGs. Experimental results on two real-world datasets (Freebase and YAGO) demonstrate the effectiveness of our proposed mechanisms and models for improving KG entity typing. The source code and data of this paper can be obtained from: https://github.com/Adam1679/ConnectE .

pdf
MIE: A Medical Information Extractor towards Medical Dialogues
Yuanzhe Zhang | Zhongtao Jiang | Tao Zhang | Shiwan Liu | Jiarun Cao | Kang Liu | Shengping Liu | Jun Zhao
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Electronic Medical Records (EMRs) have become key components of modern medical care systems. Despite the merits of EMRs, many doctors suffer from writing them, which is time-consuming and tedious. We believe that automatically converting medical dialogues to EMRs can greatly reduce the burdens of doctors, and extracting information from medical dialogues is an essential step. To this end, we annotate online medical consultation dialogues in a window-sliding style, which is much easier than the sequential labeling annotation. We then propose a Medical Information Extractor (MIE) towards medical dialogues. MIE is able to extract mentioned symptoms, surgeries, tests, other information and their corresponding status. To tackle the particular challenges of the task, MIE uses a deep matching architecture, taking dialogue turn-interaction into account. The experimental results demonstrate MIE is a promising solution to extract medical information from doctor-patient dialogues.

pdf
Clinical-Coder: Assigning Interpretable ICD-10 Codes to Chinese Clinical Notes
Pengfei Cao | Chenwei Yan | Xiangling Fu | Yubo Chen | Kang Liu | Jun Zhao | Shengping Liu | Weifeng Chong
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations

In this paper, we introduce Clinical-Coder, an online system aiming to assign ICD codes to Chinese clinical notes. ICD coding has been a research hotspot of clinical medicine, but the lack of interpretability of predictions hinders its practical application. We exploit a Dilated Convolutional Attention network with N-gram Matching mechanism (DCANM) to capture semantic features for non-continuous words and continuous n-gram words, concentrating on explaining why each ICD code is predicted. The experiments demonstrate that our approach is effective and that our system is able to provide supporting information in clinical decision making.

pdf
Event Extraction as Machine Reading Comprehension
Jian Liu | Yubo Chen | Kang Liu | Wei Bi | Xiaojiang Liu
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Event extraction (EE) is a crucial information extraction task that aims to extract event information in texts. Previous methods for EE typically model it as a classification task and are usually prone to the data scarcity problem. In this paper, we propose a new learning paradigm of EE, by explicitly casting it as a machine reading comprehension (MRC) problem. Our approach includes an unsupervised question generation process, which can transfer event schema into a set of natural questions, followed by a BERT-based question-answering process to retrieve answers as EE results. This learning paradigm enables us to strengthen the reasoning process of EE, by introducing sophisticated models in MRC, and relieve the data scarcity problem, by introducing the large-scale datasets in MRC. The empirical results show that: i) our approach attains state-of-the-art performance by considerable margins over previous methods; ii) our model excels in the data-scarce scenario, for example, obtaining 49.8% in F1 for event argument extraction with only 1% data, compared with 2.2% of the previous method; iii) our model also fits zero-shot scenarios, achieving 37.0% and 16% in F1 on two datasets without using any EE training data.

pdf
Scene Restoring for Narrative Machine Reading Comprehension
Zhixing Tian | Yuanzhe Zhang | Kang Liu | Jun Zhao | Yantao Jia | Zhicheng Sheng
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

This paper focuses on machine reading comprehension for narrative passages. Narrative passages usually describe a chain of events. When reading this kind of passage, humans tend to restore a scene according to the text with their prior knowledge, which helps them understand the passage comprehensively. Inspired by this human behavior, we propose a method to let the machine imagine a scene while reading a narrative for better comprehension. Specifically, we build a scene graph by utilizing Atomic as the external knowledge and propose a novel Graph Dimensional-Iteration Network (GDIN) to encode the graph. We conduct experiments on ROCStories, a dataset for the Story Cloze Test (SCT), and CosmosQA, a multiple-choice dataset. Our method achieves state-of-the-art results.

pdf
How Does Context Matter? On the Robustness of Event Detection with Context-Selective Mask Generalization
Jian Liu | Yubo Chen | Kang Liu | Yantao Jia | Zhicheng Sheng
Findings of the Association for Computational Linguistics: EMNLP 2020

Event detection (ED) aims to identify and classify event triggers in texts, which is a crucial subtask of event extraction (EE). Despite many advances in ED, existing studies are typically centered on improving the overall performance of an ED model and rarely consider its robustness. This paper aims to fill this research gap by stressing the importance of robustness modeling in ED models. We first pinpoint three stark cases demonstrating the brittleness of existing ED models. After analyzing the underlying reason, we propose a new training mechanism, called context-selective mask generalization for ED, which can effectively mine context-specific patterns for learning and robustify an ED model. The experimental results confirm the effectiveness of our model in defending against adversarial attacks, exploring unseen predicates, and tackling ambiguity cases. Moreover, a deeper analysis suggests that our approach can learn a complementary predictive bias with most ED models that use full context for feature learning.

2019

pdf
Learning the Extraction Order of Multiple Relational Facts in a Sentence with Reinforcement Learning
Xiangrong Zeng | Shizhu He | Daojian Zeng | Kang Liu | Shengping Liu | Jun Zhao
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

The multiple relation extraction task tries to extract all relational facts from a sentence. Existing works do not consider the extraction order of relational facts in a sentence. In this paper, we argue that the extraction order is important in this task. To take the extraction order into consideration, we apply reinforcement learning to a sequence-to-sequence model. The proposed model can generate relational facts freely. Extensive experiments on two public datasets demonstrate the efficacy of the proposed method.

pdf
Neural Cross-Lingual Event Detection with Minimal Parallel Resources
Jian Liu | Yubo Chen | Kang Liu | Jun Zhao
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

The scarcity in annotated data poses a great challenge for event detection (ED). Cross-lingual ED aims to tackle this challenge by transferring knowledge between different languages to boost performance. However, previous cross-lingual methods for ED demonstrated a heavy dependency on parallel resources, which might limit their applicability. In this paper, we propose a new method for cross-lingual ED, demonstrating a minimal dependency on parallel resources. Specifically, to construct a lexical mapping between different languages, we devise a context-dependent translation method; to treat the word order difference problem, we propose a shared syntactic order event detector for multilingual co-training. The efficiency of our method is studied through extensive experiments on two standard datasets. Empirical results indicate that our method is effective in 1) performing cross-lingual transfer concerning different directions and 2) tackling the extremely annotation-poor scenario.

pdf
Generating Questions for Knowledge Bases via Incorporating Diversified Contexts and Answer-Aware Loss
Cao Liu | Kang Liu | Shizhu He | Zaiqing Nie | Jun Zhao
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

We tackle the task of question generation over knowledge bases. Conventional methods for this task neglect two crucial research issues: 1) the given predicate needs to be expressed; 2) the answer to the generated question needs to be definitive. In this paper, we address the above two issues by incorporating diversified contexts and an answer-aware loss. Specifically, we propose a neural encoder-decoder model with multi-level copy mechanisms to generate such questions. Furthermore, the answer-aware loss is introduced to make generated questions correspond to more definitive answers. Experiments demonstrate that our model achieves state-of-the-art performance. Meanwhile, the generated questions are able to express the given predicate and correspond to a definitive answer.

pdf
Leverage Lexical Knowledge for Chinese Named Entity Recognition via Collaborative Graph Network
Dianbo Sui | Yubo Chen | Kang Liu | Jun Zhao | Shengping Liu
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

The lack of word boundary information has been seen as one of the main obstacles to developing a high-performance Chinese named entity recognition (NER) system. Fortunately, an automatically constructed lexicon contains rich word boundary information and word semantic information. However, integrating lexical knowledge in Chinese NER tasks still faces challenges when it comes to self-matched lexical words as well as the nearest contextual lexical words. We present a Collaborative Graph Network to solve these challenges. Experiments on various datasets show that our model not only outperforms the state-of-the-art (SOTA) results, but also runs six to fifteen times faster than the SOTA model.

pdf
Machine Reading Comprehension Using Structural Knowledge Graph-aware Network
Delai Qiu | Yuanzhe Zhang | Xinwei Feng | Xiangwen Liao | Wenbin Jiang | Yajuan Lyu | Kang Liu | Jun Zhao
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Leveraging external knowledge is an emerging trend in the machine comprehension task. Previous work usually utilizes knowledge graphs such as ConceptNet as external knowledge, and extracts triples from them to enhance the initial representation of the machine comprehension context. However, such methods cannot capture the structural information in the knowledge graph. To this end, we propose a Structural Knowledge Graph-aware Network (SKG) model, constructing sub-graphs for entities in the machine comprehension context. Our method dynamically updates the representation of the knowledge according to the structural information of the constructed sub-graph. Experiments show that SKG achieves state-of-the-art performance on the ReCoRD dataset.

pdf
Incorporating Interlocutor-Aware Context into Response Generation on Multi-Party Chatbots
Cao Liu | Kang Liu | Shizhu He | Zaiqing Nie | Jun Zhao
Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL)

Conventional chatbots focus on two-party response generation, which simplifies the real dialogue scene. In this paper, we strive toward a novel task of Response Generation on Multi-Party Chatbot (RGMPC), where the generated responses heavily rely on the interlocutors’ roles (e.g., speaker and addressee) and their utterances. Unfortunately, complex interactions among the interlocutors’ roles make it challenging to precisely capture conversational contexts and interlocutors’ information. Facing this challenge, we present a response generation model which incorporates Interlocutor-aware Contexts into Recurrent Encoder-Decoder frameworks (ICRED) for RGMPC. Specifically, we employ interactive representations to capture dialogue contexts for different interlocutors. Moreover, we leverage an addressee memory to enhance contextual interlocutor information for the target addressee. Finally, we construct a corpus for RGMPC based on an existing open-access dataset. Automatic and manual evaluations demonstrate that the ICRED remarkably outperforms strong baselines.

pdf
Vocabulary Pyramid Network: Multi-Pass Encoding and Decoding with Multi-Level Vocabularies for Response Generation
Cao Liu | Shizhu He | Kang Liu | Jun Zhao
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

We study the task of response generation. Conventional methods employ a fixed vocabulary and one-pass decoding, which not only makes them prone to safe and general responses but also leaves the first generated raw sequence without further refinement. To tackle the above two problems, we present a Vocabulary Pyramid Network (VPN) which is able to incorporate multi-pass encoding and decoding with multi-level vocabularies into response generation. Specifically, the dialogue input and output are represented by multi-level vocabularies which are obtained from hierarchical clustering of raw words. Then, multi-pass encoding and decoding are conducted on the multi-level vocabularies. Since VPN is able to leverage rich encoding and decoding information with multi-level vocabularies, it has the potential to generate better responses. Experiments on English Twitter and Chinese Weibo datasets demonstrate that VPN remarkably outperforms strong baselines.

pdf
AdaNSP: Uncertainty-driven Adaptive Decoding in Neural Semantic Parsing
Xiang Zhang | Shizhu He | Kang Liu | Jun Zhao
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Neural semantic parsers utilize the encoder-decoder framework to learn an end-to-end model for semantic parsing that transduces a natural language sentence to the formal semantic representation. To keep the model aware of the underlying grammar in target sequences, many constrained decoders were devised in a multi-stage paradigm, which decode to the sketches or abstract syntax trees first, and then decode to target semantic tokens. We instead propose an adaptive decoding method to avoid such intermediate representations. The decoder is guided by model uncertainty and automatically uses deeper computations when necessary. Thus it can predict tokens adaptively. Our model outperforms the state-of-the-art neural models without needing any expertise such as predefined grammar or sketches.

2018

pdf
Extracting Relational Facts by an End-to-End Neural Model with Copy Mechanism
Xiangrong Zeng | Daojian Zeng | Shizhu He | Kang Liu | Jun Zhao
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

The relational facts in sentences are often complicated. Different relational triplets may have overlaps in a sentence. We divide the sentences into three types according to triplet overlap degree: Normal, EntityPairOverlap and SingleEntityOverlap. Existing methods mainly focus on the Normal class and fail to extract relational triplets precisely. In this paper, we propose an end-to-end model based on sequence-to-sequence learning with copy mechanism, which can jointly extract relational facts from sentences of any of these classes. We adopt two different strategies in the decoding process: employing only one united decoder or applying multiple separated decoders. We test our models on two public datasets, and they outperform the baseline method significantly.

pdf
DCFEE: A Document-level Chinese Financial Event Extraction System based on Automatically Labeled Training Data
Hang Yang | Yubo Chen | Kang Liu | Yang Xiao | Jun Zhao
Proceedings of ACL 2018, System Demonstrations

We present an event extraction framework to detect event mentions and extract events from document-level financial news. Up to now, methods based on the supervised learning paradigm have achieved the highest performance on public datasets (such as ACE2005, KBP2015). These methods heavily depend on manually labeled training data. However, in particular areas, such as the financial, medical and judicial domains, there is not enough labeled data due to the high cost of the data labeling process. Moreover, most current methods focus on extracting events from one sentence, but an event is usually expressed by multiple sentences in one document. To solve these problems, we propose a Document-level Chinese Financial Event Extraction (DCFEE) system which can automatically generate large-scale labeled data and extract events from the whole document. Experimental results demonstrate its effectiveness.

pdf
Pattern-revising Enhanced Simple Question Answering over Knowledge Bases
Yanchao Hao | Hao Liu | Shizhu He | Kang Liu | Jun Zhao
Proceedings of the 27th International Conference on Computational Linguistics

Question Answering over Knowledge Bases (KB-QA), which automatically answers natural language questions based on the facts contained in a knowledge base, is one of the most important natural language processing (NLP) tasks. Simple questions constitute a large part of the questions queried on the web, yet they remain a challenge to QA systems. In this work, we propose to conduct pattern extraction and entity linking first, and put forward a pattern revising procedure to mitigate the error propagation problem. In order to learn to rank candidate subject-predicate pairs to enable the retrieval of relevant facts given a question, we propose to do joint fact selection enhanced by relation detection. Multi-level encodings and multi-dimension information are leveraged to strengthen the whole procedure. The experimental results demonstrate that our approach sets a new record in this task, outperforming the current state-of-the-art by a large absolute margin.

pdf
Adversarial Transfer Learning for Chinese Named Entity Recognition with Self-Attention Mechanism
Pengfei Cao | Yubo Chen | Kang Liu | Jun Zhao | Shengping Liu
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Named entity recognition (NER) is an important task in natural language processing, which needs to determine entity boundaries and classify them into pre-defined categories. For the Chinese NER task, there is only a very small amount of annotated data available. The Chinese NER task and the Chinese word segmentation (CWS) task have many similar word boundaries. There are also specificities in each task. However, existing methods for Chinese NER either do not exploit word boundary information from CWS or cannot filter the specific information of CWS. In this paper, we propose a novel adversarial transfer learning framework to make full use of task-shared boundary information and filter out the task-specific features of CWS. Besides, since an arbitrary character can provide important cues when predicting entity type, we exploit self-attention to explicitly capture long-range dependencies between two tokens. Experimental results on two different widely used datasets show that our proposed model significantly and consistently outperforms other state-of-the-art methods.

pdf
Collective Event Detection via a Hierarchical and Bias Tagging Networks with Gated Multi-level Attention Mechanisms
Yubo Chen | Hang Yang | Kang Liu | Jun Zhao | Yantao Jia
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Traditional approaches to the task of ACE event detection primarily regard multiple events in one sentence as independent ones and recognize them separately by using sentence-level information. However, events in one sentence are usually interdependent and sentence-level information is often insufficient to resolve ambiguities for some types of events. This paper proposes a novel framework dubbed Hierarchical and Bias Tagging Networks with Gated Multi-level Attention Mechanisms (HBTNGMA) to solve the two problems simultaneously. Firstly, we propose hierarchical and bias tagging networks to detect multiple events in one sentence collectively. Then, we devise a gated multi-level attention to automatically extract and dynamically fuse the sentence-level and document-level information. The experimental results on the widely used ACE 2005 dataset show that our approach significantly outperforms other state-of-the-art methods.

2017

pdf
Generating Natural Answers by Incorporating Copying and Retrieving Mechanisms in Sequence-to-Sequence Learning
Shizhu He | Cao Liu | Kang Liu | Jun Zhao
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Generating answers as natural language sentences is very important in real-world question answering systems, which need to produce a correct answer as well as a coherent natural response. In this paper, we propose an end-to-end question answering system called COREQA in sequence-to-sequence learning, which incorporates copying and retrieving mechanisms to generate natural answers within an encoder-decoder framework. Specifically, in COREQA, the semantic units (words, phrases and entities) in a natural answer are dynamically predicted from the vocabulary, copied from the given question and/or retrieved from the corresponding knowledge base jointly. Our empirical study on both synthetic and real-world datasets demonstrates the efficiency of COREQA, which is able to generate correct, coherent and natural answers for knowledge-inquired questions.

pdf
An End-to-End Model for Question Answering over Knowledge Base with Cross-Attention Combining Global Knowledge
Yanchao Hao | Yuanzhe Zhang | Kang Liu | Shizhu He | Zhanyi Liu | Hua Wu | Jun Zhao
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

With the rapid growth of knowledge bases (KBs) on the web, how to take full advantage of them becomes increasingly important. Question answering over knowledge base (KB-QA) is one of the promising approaches to access the substantial knowledge. Meanwhile, as the neural network-based (NN-based) methods develop, NN-based KB-QA has already achieved impressive results. However, previous work put little emphasis on question representation, and the question is converted into a fixed vector regardless of its candidate answers. This simple representation strategy cannot easily express the proper information in the question. Hence, we present an end-to-end neural network model to represent the questions and their corresponding scores dynamically according to the various candidate answer aspects via a cross-attention mechanism. In addition, we leverage the global knowledge inside the underlying KB, aiming at integrating the rich KB information into the representation of the answers. As a result, it could alleviate the out-of-vocabulary (OOV) problem, which helps the cross-attention model to represent the question more precisely. The experimental results on WebQuestions demonstrate the effectiveness of the proposed approach.

pdf
Handling Cold-Start Problem in Review Spam Detection by Jointly Embedding Texts and Behaviors
Xuepeng Wang | Kang Liu | Jun Zhao
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Solving the cold-start problem in review spam detection is an urgent and significant task. It can help online review websites relieve the damage caused by spammers in time, but has never been investigated by previous work. This paper proposes a novel neural network model to detect review spam for the cold-start problem, by learning to represent new reviewers’ reviews with jointly embedded textual and behavioral information. Experimental results show that the proposed model achieves effective performance and possesses preferable domain-adaptability. It is also applicable to a large-scale dataset in an unsupervised way.

pdf
Automatically Labeled Data Generation for Large Scale Event Extraction
Yubo Chen | Shulin Liu | Xiang Zhang | Kang Liu | Jun Zhao
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Modern models of event extraction for tasks like ACE are based on supervised learning of events from small hand-labeled data. However, hand-labeled training data is expensive to produce, has low coverage of event types, and is limited in size, which makes it hard for supervised methods to extract events at large scale for knowledge base population. To solve the data labeling problem, we propose to automatically label training data for event extraction via world knowledge and linguistic knowledge, which can detect key arguments and trigger words for each event type and employ them to label events in texts automatically. The experimental results show that the quality of our large-scale automatically labeled data is competitive with elaborately human-labeled data. Moreover, our automatically labeled data can be combined with human-labeled data to improve the performance of models learned from these data.

pdf
Exploiting Argument Information to Improve Event Detection via Supervised Attention Mechanisms
Shulin Liu | Yubo Chen | Kang Liu | Jun Zhao
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

This paper tackles the task of event detection (ED), which involves identifying and categorizing events. We argue that arguments provide significant clues to this task, but they are either completely ignored or exploited in an indirect manner in existing detection approaches. In this work, we propose to exploit argument information explicitly for ED via supervised attention mechanisms. Specifically, we systematically investigate the proposed model under the supervision of different attention strategies. Experimental results show that our approach advances the state of the art and achieves the best F1 score on the ACE 2005 dataset.

pdf
Which is the Effective Way for Gaokao: Information Retrieval or Neural Networks?
Shangmin Guo | Xiangrong Zeng | Shizhu He | Kang Liu | Jun Zhao
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers

As one of the most important tests in China, Gaokao is designed to be difficult enough to distinguish excellent high school students. In this work, we detail the Gaokao History Multiple Choice Questions (GKHMC) and propose two different approaches to address them using various resources. One approach is based on an entity search technique (IR approach); the other is based on a text entailment approach in which we specifically employ deep neural networks (NN approach). Experimental results on our collected real Gaokao questions show that the two approaches are good at different categories of questions: the IR approach performs much better on entity questions (EQs), while the NN approach shows its advantage on sentence questions (SQs). We achieve state-of-the-art performance and show that it is indispensable to apply a hybrid method when participating in real-world tests.

pdf
IJCNLP-2017 Task 5: Multi-choice Question Answering in Examinations
Shangmin Guo | Kang Liu | Shizhu He | Cao Liu | Jun Zhao | Zhuoyu Wei
Proceedings of the IJCNLP 2017, Shared Tasks

The IJCNLP-2017 Multi-choice Question Answering (MCQA) task aims at exploring the performance of current Question Answering (QA) techniques via real-world complex questions collected from Chinese Senior High School Entrance Examination papers and the CK12 website. The questions are all 4-way multi-choice questions written in Chinese and English respectively that cover a wide range of subjects, e.g., Biology, History, Life Science, etc. All questions are restricted to the elementary and middle school level. During the whole procedure of this task, 7 teams submitted 323 runs in total. This paper describes the collected data, the format and size of these questions, formal run statistics and results, and an overview and performance statistics of different methods.

2016

pdf
Learning to Represent Review with Tensor Decomposition for Spam Detection
Xuepeng Wang | Kang Liu | Shizhu He | Jun Zhao
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

pdf
Mining Inference Formulas by Goal-Directed Random Walks
Zhuoyu Wei | Jun Zhao | Kang Liu
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

pdf
Inner Attention based Recurrent Neural Networks for Answer Selection
Bingning Wang | Kang Liu | Jun Zhao
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf
Leveraging FrameNet to Improve Automatic Event Detection
Shulin Liu | Yubo Chen | Shizhu He | Kang Liu | Jun Zhao
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf
Book Review: Sentiment Analysis: Mining Opinions, Sentiments, and Emotions by Bing Liu
Jun Zhao | Kang Liu | Liheng Xu
Computational Linguistics, Volume 42, Issue 3 - September 2016

2015

pdf
Event Extraction via Dynamic Multi-Pooling Convolutional Neural Networks
Yubo Chen | Liheng Xu | Kang Liu | Daojian Zeng | Jun Zhao
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

pdf
Sentiment-Aspect Extraction based on Restricted Boltzmann Machines
Linlin Wang | Kang Liu | Zhu Cao | Jun Zhao | Gerard de Melo
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

pdf
Knowledge Graph Embedding via Dynamic Mapping Matrix
Guoliang Ji | Shizhu He | Liheng Xu | Kang Liu | Jun Zhao
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

pdf
Distant Supervision for Relation Extraction via Piecewise Convolutional Neural Networks
Daojian Zeng | Kang Liu | Yubo Chen | Jun Zhao
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

2014

Question Answering over Linked Data Using First-order Logic
Shizhu He | Kang Liu | Yuanzhe Zhang | Liheng Xu | Jun Zhao
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Extracting Opinion Targets and Opinion Words from Online Reviews with Graph Co-ranking
Kang Liu | Liheng Xu | Jun Zhao
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Product Feature Mining: Semantic Clues versus Syntactic Constituents
Liheng Xu | Kang Liu | Siwei Lai | Jun Zhao
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Joint Opinion Relation Detection Using One-Class Deep Neural Network
Liheng Xu | Kang Liu | Jun Zhao
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

Exploring Fine-grained Entity Type Constraints for Distantly Supervised Relation Extraction
Yang Liu | Kang Liu | Liheng Xu | Jun Zhao
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

Relation Classification via Convolutional Deep Neural Network
Daojian Zeng | Kang Liu | Siwei Lai | Guangyou Zhou | Jun Zhao
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

2013

A Weakly Supervised Bayesian Model for Violence Detection in Social Media
Amparo Elizabeth Cano Basave | Yulan He | Kang Liu | Jun Zhao
Proceedings of the Sixth International Joint Conference on Natural Language Processing

Attribute Relation Extraction from Template-inconsistent Semi-structured Text by Leveraging Site-level Knowledge
Yang Liu | Fang Liu | Siwei Lai | Kang Liu | Guangyou Zhou | Jun Zhao
Proceedings of the Sixth International Joint Conference on Natural Language Processing

Syntactic Patterns versus Word Alignment: Extracting Opinion Targets from Online Reviews
Kang Liu | Liheng Xu | Jun Zhao
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Mining Opinion Words and Opinion Targets in a Two-Stage Framework
Liheng Xu | Kang Liu | Siwei Lai | Yubo Chen | Jun Zhao
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

2012

Opinion Target Extraction Using Word-Based Translation Model
Kang Liu | Liheng Xu | Jun Zhao
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning

Exploiting Bilingual Translation for Question Retrieval in Community-Based Question Answering
Guangyou Zhou | Kang Liu | Jun Zhao
Proceedings of COLING 2012

2011

Improving Dependency Parsing with Fined-Grained Features
Guangyou Zhou | Li Cai | Kang Liu | Jun Zhao
Proceedings of 5th International Joint Conference on Natural Language Processing

Learning the Latent Topics for Question Retrieval in Community QA
Li Cai | Guangyou Zhou | Kang Liu | Jun Zhao
Proceedings of 5th International Joint Conference on Natural Language Processing

Phrase-Based Translation Model for Question Retrieval in Community Question Answer Archives
Guangyou Zhou | Li Cai | Jun Zhao | Kang Liu
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

Exploiting Web-Derived Selectional Preference to Improve Statistical Dependency Parsing
Guangyou Zhou | Jun Zhao | Kang Liu | Li Cai
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

2009

A Chinese-English Organization Name Translation System Using Heuristic Web Mining and Asymmetric Alignment
Fan Yang | Jun Zhao | Kang Liu
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP

2008

Chinese-English Backward Transliteration Assisted with Mining Monolingual Web Pages
Fan Yang | Jun Zhao | Bo Zou | Kang Liu | Feifan Liu
Proceedings of ACL-08: HLT

Adding Redundant Features for CRFs-based Sentence Sentiment Classification
Jun Zhao | Kang Liu | Gen Wang
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing
