Haiyun Jiang


2023

pdf
Unsupervised Keyphrase Extraction by Learning Neural Keyphrase Set Function
Mingyang Song | Haiyun Jiang | Lemao Liu | Shuming Shi | Liping Jing
Findings of the Association for Computational Linguistics: ACL 2023

We create a paradigm shift concerning building unsupervised keyphrase extraction systems in this paper. Instead of modeling the relevance between an individual candidate phrase and the document as in the commonly used framework, we formulate the unsupervised keyphrase extraction task as a document-set matching problem from a set-wise perspective, in which the document and the candidate set are globally matched in the semantic space to particularly take into account the interactions among all candidate phrases. Since it is intractable to exactly extract the keyphrase set by the matching function during the inference, we propose an approximate approach, which obtains the candidate subsets via a set extractor agent learned by reinforcement learning. Exhaustive experimental results demonstrate the effectiveness of our model, which outperforms the recent state-of-the-art unsupervised keyphrase extraction baselines by a large margin.

pdf
Making Better Use of Training Corpus: Retrieval-based Aspect Sentiment Triplet Extraction via Label Interpolation
Guoxin Yu | Lemao Liu | Haiyun Jiang | Shuming Shi | Xiang Ao
Findings of the Association for Computational Linguistics: ACL 2023

In this paper, we aim to adapt the idea of retrieval-based neural approaches to the Aspect Sentiment Triplet Extraction (ASTE) task. Different from previous studies retrieving semantic similar neighbors, the ASTE task has its specialized challenges when adapting, i.e., the purpose includes predicting the sentiment polarity and it is usually aspect-dependent. Semantic similar neighbors with different polarities will be infeasible even counterproductive. To tackle this issue, we propose a retrieval-based neural ASTE approach, named RLI (Retrieval-based Aspect Sentiment Triplet Extraction via Label Interpolation), which exploits the label information of neighbors. Given an aspect-opinion term pair, we retrieve semantic similar triplets from the training corpus and interpolate their label information into the augmented representation of the target pair. The retriever is jointly trained with the whole ASTE framework, and neighbors with both similar semantics and sentiments can be recalled with the aid of this distant supervision. In addition, we design a simple yet effective pre-train method for the retriever that implicitly encodes the label similarities. Extensive experiments and analysis on two widely-used benchmarks show that the proposed model establishes a new state-of-the-art on ASTE.

pdf
Retrieval-Augmented Few-shot Text Classification
Guoxin Yu | Lemao Liu | Haiyun Jiang | Shuming Shi | Xiang Ao
Findings of the Association for Computational Linguistics: EMNLP 2023

Retrieval-augmented methods are successful in the standard scenario where the retrieval space is sufficient; whereas in the few-shot scenario with limited retrieval space, this paper shows it is non-trivial to put them into practice. First, it is impossible to retrieve semantically similar examples by using an off-the-shelf metric and it is crucial to learn a task-specific retrieval metric; Second, our preliminary experiments demonstrate that it is difficult to optimize a plausible metric by minimizing the standard cross-entropy loss. The in-depth analyses quantitatively show minimizing cross-entropy loss suffers from the weak supervision signals and the severe gradient vanishing issue during the optimization. To address these issues, we introduce two novel training objectives, namely EM-L and R-L, which provide more task-specific guidance to the retrieval metric by the EM algorithm and a ranking-based loss, respectively. Extensive experiments on 10 datasets prove the superiority of the proposed retrieval augmented methods on the performance.

pdf
Large Language Models Meet Harry Potter: A Dataset for Aligning Dialogue Agents with Characters
Nuo Chen | Yan Wang | Haiyun Jiang | Deng Cai | Yuhan Li | Ziyang Chen | Longyue Wang | Jia Li
Findings of the Association for Computational Linguistics: EMNLP 2023

In recent years, Dialogue-style Large Language Models (LLMs) such as ChatGPT and GPT4 have demonstrated immense potential in constructing open-domain dialogue agents. However, aligning these agents with specific characters or individuals remains a considerable challenge due to the complexities of character representation and the lack of comprehensive annotations. In this paper, we introduce the Harry Potter Dialogue (HPD) dataset, designed to advance the study of dialogue agents and character alignment. The dataset encompasses all dialogue sessions (in both English and Chinese) from the Harry Potter series and is annotated with vital background information, including dialogue scenes, speakers, character relationships, and attributes. These extensive annotations may empower LLMs to unlock character-driven dialogue capabilities. Furthermore, it can serve as a universal benchmark for evaluating how well can a LLM aligning with a specific character. We benchmark LLMs on HPD using both fine-tuning and in-context learning settings. Evaluation results reveal that although there is substantial room for improvement in generating high-quality, character-aligned responses, the proposed dataset is valuable in guiding models toward responses that better align with the character of Harry Potter.

pdf
Exploring the Use of Large Language Models for Reference-Free Text Quality Evaluation: An Empirical Study
Yi Chen | Rui Wang | Haiyun Jiang | Shuming Shi | Ruifeng Xu
Findings of the Association for Computational Linguistics: IJCNLP-AACL 2023 (Findings)

pdf
Sen2Pro: A Probabilistic Perspective to Sentence Embedding from Pre-trained Language Model
Lingfeng Shen | Haiyun Jiang | Lemao Liu | Shuming Shi
Proceedings of the 8th Workshop on Representation Learning for NLP (RepL4NLP 2023)

pdf
Effidit: An Assistant for Improving Writing Efficiency
Shuming Shi | Enbo Zhao | Wei Bi | Deng Cai | Leyang Cui | Xinting Huang | Haiyun Jiang | Duyu Tang | Kaiqiang Song | Longyue Wang | Chenyan Huang | Guoping Huang | Yan Wang | Piji Li
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)

Writing assistants are valuable tools that can help writers improve their writing skills. We introduce Effidit (Efficient and Intelligent Editing), a digital writing assistant that facilitates users to write higher-quality text more efficiently through the use of Artificial Intelligence (AI) and Natural Language Processing (NLP) technologies. We significantly expand the capacities of a writing assistantby providing functions in three modules: text completion, hint recommendation, and writing refinement. Based on the above efforts, Effidit can efficiently assist users in creating their own text. Effidit has been deployed to several Tencent products and publicly released at https://effidit.qq.com/.

2022

pdf
MCPG: A Flexible Multi-Level Controllable Framework for Unsupervised Paraphrase Generation
Yi Chen | Haiyun Jiang | Lemao Liu | Rui Wang | Shuming Shi | Ruifeng Xu
Findings of the Association for Computational Linguistics: EMNLP 2022

We present MCPG: a simple and effectiveapproach for controllable unsupervised paraphrase generation, which is also flexible toadapt to specific domains without extra training. MCPG is controllable in different levels: local lexicons, global semantics, and universal styles. The unsupervised paradigm ofMCPG combines factual keywords and diversified semantic embeddings as local lexical andglobal semantic constraints. The semantic embeddings are diversified by standard dropout,which is exploited for the first time to increaseinference diversity by us. Moreover, MCPGis qualified with good domain adaptability byadding a transfer vector as a universal style constraint, which is refined from the exemplars retrieved from the corpus of the target domain in atraining-free way. Extensive experiments showthat MCPG outperforms state-of-the-art unsupervised baselines by a margin. Meanwhile,our domain-adapted MCPG also achieves competitive performance with strong supervisedbaselines even without training.

pdf
Learning from Sibling Mentions with Scalable Graph Inference in Fine-Grained Entity Typing
Yi Chen | Jiayang Cheng | Haiyun Jiang | Lemao Liu | Haisong Zhang | Shuming Shi | Ruifeng Xu
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

In this paper, we firstly empirically find that existing models struggle to handle hard mentions due to their insufficient contexts, which consequently limits their overall typing performance. To this end, we propose to exploit sibling mentions for enhancing the mention representations. Specifically, we present two different metrics for sibling selection and employ an attentive graph neural network to aggregate information from sibling mentions. The proposed graph model is scalable in that unseen test mentions are allowed to be added as new nodes for inference. Exhaustive experiments demonstrate the effectiveness of our sibling learning strategy, where our model outperforms ten strong baselines. Moreover, our experiments indeed prove the superiority of sibling mentions in helping clarify the types for hard mentions.

pdf
On the Evaluation Metrics for Paraphrase Generation
Lingfeng Shen | Lemao Liu | Haiyun Jiang | Shuming Shi
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

In this paper we revisit automatic metrics for paraphrase evaluation and obtain two findings that disobey conventional wisdom: (1) Reference-free metrics achieve better performance than their reference-based counterparts. (2) Most commonly used metrics do not align well with human annotation. Underlying reasons behind the above findings are explored through additional experiments and in-depth analyses. Based on the experiments and analyses, we propose ParaScore, a new evaluation metric for paraphrase generation. It possesses the merits of reference-based and reference-free metrics and explicitly models lexical divergence. Based on our analysis and improvements, our proposed reference-based outperforms than reference-free metrics. Experimental results demonstrate that ParaScore significantly outperforms existing metrics.

2021

pdf
An Empirical Study on Multiple Information Sources for Zero-Shot Fine-Grained Entity Typing
Yi Chen | Haiyun Jiang | Lemao Liu | Shuming Shi | Chuang Fan | Min Yang | Ruifeng Xu
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Auxiliary information from multiple sources has been demonstrated to be effective in zero-shot fine-grained entity typing (ZFET). However, there lacks a comprehensive understanding about how to make better use of the existing information sources and how they affect the performance of ZFET. In this paper, we empirically study three kinds of auxiliary information: context consistency, type hierarchy and background knowledge (e.g., prototypes and descriptions) of types, and propose a multi-source fusion model (MSF) targeting these sources. The performance obtains up to 11.42% and 22.84% absolute gains over state-of-the-art baselines on BBN and Wiki respectively with regard to macro F1 scores. More importantly, we further discuss the characteristics, merits and demerits of each information source and provide an intuitive understanding of the complementarity among them.

pdf
Fine-grained Entity Typing without Knowledge Base
Jing Qian | Yibin Liu | Lemao Liu | Yangming Li | Haiyun Jiang | Haisong Zhang | Shuming Shi
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Existing work on Fine-grained Entity Typing (FET) typically trains automatic models on the datasets obtained by using Knowledge Bases (KB) as distant supervision. However, the reliance on KB means this training setting can be hampered by the lack of or the incompleteness of the KB. To alleviate this limitation, we propose a novel setting for training FET models: FET without accessing any knowledge base. Under this setting, we propose a two-step framework to train FET models. In the first step, we automatically create pseudo data with fine-grained labels from a large unlabeled dataset. Then a neural network model is trained based on the pseudo data, either in an unsupervised way or using self-training under the weak guidance from a coarse-grained Named Entity Recognition (NER) model. Experimental results show that our method achieves competitive performance with respect to the models trained on the original KB-supervised datasets.

pdf bib
TexSmart: A System for Enhanced Natural Language Understanding
Lemao Liu | Haisong Zhang | Haiyun Jiang | Yangming Li | Enbo Zhao | Kun Xu | Linfeng Song | Suncong Zheng | Botong Zhou | Dick Zhu | Xiao Feng | Tao Chen | Tao Yang | Dong Yu | Feng Zhang | ZhanHui Kang | Shuming Shi
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: System Demonstrations

This paper introduces TexSmart, a text understanding system that supports fine-grained named entity recognition (NER) and enhanced semantic analysis functionalities. Compared to most previous publicly available text understanding systems and tools, TexSmart holds some unique features. First, the NER function of TexSmart supports over 1,000 entity types, while most other public tools typically support several to (at most) dozens of entity types. Second, TexSmart introduces new semantic analysis functions like semantic expansion and deep semantic representation, that are absent in most previous systems. Third, a spectrum of algorithms (from very fast algorithms to those that are relatively slow but more accurate) are implemented for one function in TexSmart, to fulfill the requirements of different academic and industrial applications. The adoption of unsupervised or weakly-supervised algorithms is especially emphasized, with the goal of easily updating our models to include fresh data with less human annotation efforts.

2019

pdf
Ensuring Readability and Data-fidelity using Head-modifier Templates in Deep Type Description Generation
Jiangjie Chen | Ao Wang | Haiyun Jiang | Suo Feng | Chenguang Li | Yanghua Xiao
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

A type description is a succinct noun compound which helps human and machines to quickly grasp the informative and distinctive information of an entity. Entities in most knowledge graphs (KGs) still lack such descriptions, thus calling for automatic methods to supplement such information. However, existing generative methods either overlook the grammatical structure or make factual mistakes in generated texts. To solve these problems, we propose a head-modifier template based method to ensure the readability and data fidelity of generated type descriptions. We also propose a new dataset and two metrics for this task. Experiments show that our method improves substantially compared with baselines and achieves state-of-the-art performance on both datasets.