Dai Dai


Unified Structure Generation for Universal Information Extraction
Yaojie Lu | Qing Liu | Dai Dai | Xinyan Xiao | Hongyu Lin | Xianpei Han | Le Sun | Hua Wu
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Information extraction suffers from its varying targets, heterogeneous structures, and demand-specific schemas. In this paper, we propose a unified text-to-structure generation framework, namely UIE, which can universally model different IE tasks, adaptively generate targeted structures, and collaboratively learn general IE abilities from different knowledge sources. Specifically, UIE uniformly encodes different extraction structures via a structured extraction language, adaptively generates target extractions via a schema-based prompt mechanism (the structural schema instructor), and captures the common IE abilities via a large-scale pretrained text-to-structure model. Experiments show that UIE achieves state-of-the-art performance on 4 IE tasks and 13 datasets, in all of the supervised, low-resource, and few-shot settings, for a wide range of entity, relation, event, and sentiment extraction tasks and their unification. These results verify the effectiveness, universality, and transferability of UIE.

Learn and Review: Enhancing Continual Named Entity Recognition via Reviewing Synthetic Samples
Yu Xia | Quan Wang | Yajuan Lyu | Yong Zhu | Wenhao Wu | Sujian Li | Dai Dai
Findings of the Association for Computational Linguistics: ACL 2022

Traditional methods for named entity recognition (NER) classify mentions into a fixed set of pre-defined entity types. However, in many real-world scenarios, new entity types are incrementally involved. To investigate this problem, continual learning is introduced for NER. However, the existing method depends on the relevance between tasks and is prone to inter-type confusion. In this paper, we propose a novel two-stage framework, Learn-and-Review (L&R), for continual NER under the type-incremental setting to alleviate the above issues. Specifically, for the learning stage, we distill the old knowledge from a teacher to a student on the current dataset. For the reviewing stage, we first generate synthetic samples of old types to augment the dataset. Then, we further distill new knowledge from the above student and old knowledge from the teacher to get an enhanced student on the augmented dataset. This stage has the following advantages: (1) the synthetic samples mitigate the gap between the old and new tasks and thus enhance the further distillation; (2) different types of entities are jointly seen during training, which alleviates the inter-type confusion. Experimental results show that L&R outperforms the state-of-the-art method on CoNLL-03 and OntoNotes-5.0.
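
As an illustration of the distillation step the abstract mentions, the following is a minimal sketch of a generic teacher-to-student distillation loss. It assumes the loss is a KL divergence between temperature-softened output distributions, which the abstract does not specify; the function names and the `temperature` parameter are hypothetical and are not taken from the L&R paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) over temperature-softened distributions.

    Assumption: this generic form stands in for whatever loss the paper uses.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Logits over the old entity types for a single token (made-up numbers).
teacher = [2.0, 0.5, -1.0]
student = [1.0, 1.0, 0.0]
loss = distillation_loss(teacher, student)
```

The loss is zero when the student reproduces the teacher's distribution and grows as the two diverge, which is what lets the student retain old-type knowledge without access to the old training data.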


From Learning-to-Match to Learning-to-Discriminate: Global Prototype Learning for Few-shot Relation Classification
Liu Fangchao | Xiao Xinyan | Yan Lingyong | Lin Hongyu | Han Xianpei | Dai Dai | Wu Hua | Sun Le
Proceedings of the 20th Chinese National Conference on Computational Linguistics

Few-shot relation classification has attracted great attention recently and is regarded as an effective way to tackle the long-tail problem in relation classification. Most previous works on few-shot relation classification are based on learning-to-match paradigms, which focus on learning an effective universal matcher between the query and one target class prototype based on inner-class support sets. However, the learning-to-match paradigm focuses on capturing the similarity knowledge between query and class prototype while failing to consider discriminative information between different candidate classes. Such information is critical, especially when target classes are highly confusing and domain shifting exists between training and testing phases. In this paper, we propose the Global Transformed Prototypical Networks (GTPN), which learns to build a few-shot model to directly discriminate between the query and all target classes with both inner-class local information and inter-class global information. Such a learning-to-discriminate paradigm can make the model concentrate more on the discriminative knowledge between all candidate classes and therefore leads to better classification performance. We conducted experiments on standard FewRel benchmarks. Experimental results show that GTPN achieves very competitive performance on few-shot relation classification and reached the best performance on the official leaderboard of FewRel 2.0.
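
To make the prototype-based setup concrete, here is a minimal sketch of the generic prototypical-network formulation that this line of work builds on: each class prototype is the mean of its support embeddings, and a query is scored against every prototype. This is not GTPN itself; the function names, relation labels, and 2-dimensional embeddings are hypothetical.

```python
import math


def prototype(support):
    """Mean of the support embeddings for one class."""
    dim = len(support[0])
    return [sum(vec[i] for vec in support) / len(support) for i in range(dim)]


def classify(query, support_sets):
    """Softmax over negative squared distances from the query to each prototype."""
    scores = {}
    for label, vectors in support_sets.items():
        proto = prototype(vectors)
        scores[label] = -sum((q - p) ** 2 for q, p in zip(query, proto))
    peak = max(scores.values())
    exps = {label: math.exp(s - peak) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


# A 2-way 2-shot episode with toy embeddings (made-up numbers).
support_sets = {
    "founder_of": [[1.0, 0.0], [0.8, 0.2]],
    "born_in": [[0.0, 1.0], [0.2, 0.8]],
}
probs = classify([0.9, 0.1], support_sets)
predicted = max(probs, key=probs.get)
```

In this baseline each prototype is computed from its own support set only, so it carries just the inner-class information; the abstract's argument is that inter-class information should also shape the decision.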


ARNOR: Attention Regularization based Noise Reduction for Distant Supervision Relation Classification
Wei Jia | Dai Dai | Xinyan Xiao | Hua Wu
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Distant supervision is widely used in relation classification in order to create large-scale training data by aligning a knowledge base with an unlabeled corpus. However, it also introduces a large number of noisy labels, where a contextual sentence does not actually express the labeled relation. In this paper, we propose ARNOR, a novel Attention Regularization based NOise Reduction framework for distant supervision relation classification. ARNOR assumes that a trustable relation label should be explained by the neural attention model. Specifically, our ARNOR framework iteratively learns an interpretable model and utilizes it to select trustable instances. We first introduce attention regularization to force the model to pay attention to the patterns which explain the relation labels, so as to make the model more interpretable. Then, if the learned model can clearly locate the relation patterns of a candidate instance in the training set, we select it as a trustable instance for the next training step. According to the experiments on NYT data, our ARNOR framework achieves significant improvements over state-of-the-art methods in both relation classification performance and noise reduction effect.
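
The attention regularization idea can be sketched as follows. This is a minimal illustration that assumes the regularizer is a KL divergence between a target distribution concentrated on the pattern tokens and the model's attention weights; the abstract does not state the exact form, and the function names, token positions, and attention values here are hypothetical.

```python
import math


def target_attention(num_tokens, pattern_positions):
    """Uniform distribution over the pattern token positions, zero elsewhere."""
    weight = 1.0 / len(pattern_positions)
    return [weight if i in pattern_positions else 0.0 for i in range(num_tokens)]


def attention_regularizer(target, attention, eps=1e-12):
    """KL(target || attention); small when attention mass sits on the pattern."""
    return sum(
        t * math.log(t / max(a, eps))
        for t, a in zip(target, attention)
        if t > 0
    )


# Five tokens, with the relation pattern at positions 2 and 3 (made-up example).
target = target_attention(5, {2, 3})
focused = [0.05, 0.05, 0.45, 0.40, 0.05]  # attention mostly on the pattern
diffuse = [0.2, 0.2, 0.2, 0.2, 0.2]       # attention spread evenly
focused_penalty = attention_regularizer(target, focused)
diffuse_penalty = attention_regularizer(target, diffuse)
```

Under this assumption, an instance whose attention already concentrates on the pattern incurs a small penalty, which matches the abstract's notion that a trustable label is one the attention model can explain.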