The objective of the Causal Emotion Entailment (CEE) task is to identify the causes of the target emotional utterances in a given conversation. Most existing studies have focused on a fine-tuning paradigm based on a pretrained model, e.g., the BERT model. However, there are gaps between the pretraining task and the CEE task. Although a pretrained model enhances contextual comprehension to some extent, it cannot acquire specific knowledge that is relevant to the CEE task. In addition, in a typical CEE task, the positions of emotion utterances and cause utterances in a conversation follow distinctive distributions that differ across emotion types. Existing methods employ a fixed-size window to capture the relationship between neighboring utterances; however, these methods ignore the specific semantic associations between emotion and cause utterances. To address these issues, we propose the Position-oriented Prompt-tuning (POP-CEE) model to solve the CEE task in an end-to-end manner. Specifically, we model the CEE task by designing prompts with multiple unified goals and by exploring the positional relationship between emotion and cause utterances using a position constraint module. Experimental results demonstrate that the proposed POP-CEE model achieves state-of-the-art performance on a benchmark dataset. Our code and data can be found at: https://github.com/Zh0uzh/POP-CEE.
Emotion-cause pair extraction (ECPE) focuses on extracting all potential emotion clauses and their corresponding cause clauses from unannotated documents. Existing methods achieve promising results with the help of fine-tuning and prompt paradigms, but they have three drawbacks. First, most approaches cannot distinguish between emotion-cause pairs that belong to different types of emotions, which limits their applicability. Second, existing prompt methods use a one-to-one relation to map label words to categories, which introduces considerable bias into the results. Third, existing methods perform cause extraction by relying on explicit semantic understanding or basic prompt templates, ignoring the implicit information contained in the cause clauses themselves. To solve these issues, we propose an Emotion knowledge-aware Prompt-tuning for Emotion-Cause Pair Extraction (EmoPrompt-ECPE) method, which integrates the knowledge of emotion categories into the ECPE task and mines the implicit knowledge of cause clauses. Specifically, we inject the latent knowledge of the cause clauses and the emotion types into the prompt template. In addition, we extend the emotion labels with an external emotion word base to obtain a many-to-one mapping of label words to categories. Furthermore, we apply cosine similarity filtering to the label word base to reduce the noise caused by knowledge introduction. Experiments on both Chinese and English benchmark datasets show that our approach achieves state-of-the-art results. Our code and data can be found at: https://github.com/xy-xiaotudou/EmoPrompt-ECPE.
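To make the many-to-one label mapping and the cosine similarity filtering in the EmoPrompt-ECPE abstract concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. The abstract does not specify the embeddings, the similarity threshold, or how label-word probabilities are aggregated; the function names `filter_label_words` and `category_scores`, the threshold of 0.5, the averaging step, and the toy 2-d vectors are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def filter_label_words(category_vec, candidates, threshold=0.5):
    """Keep candidate label words whose embedding is close to the category.

    `candidates` maps each word from an external word base to its embedding.
    The threshold is an assumed hyperparameter.
    """
    return [
        word
        for word, vec in candidates.items()
        if cosine(category_vec, vec) >= threshold
    ]


def category_scores(word_probs, verbalizer):
    """Many-to-one mapping: average the probabilities of a category's words.

    Averaging is one possible aggregation, assumed here for illustration.
    """
    scores = {}
    for category, words in verbalizer.items():
        probs = [word_probs.get(w, 0.0) for w in words]
        scores[category] = sum(probs) / len(probs) if probs else 0.0
    return scores


# Toy embeddings, invented for illustration only.
happiness_vec = [1.0, 0.0]
candidates = {"joyful": [0.9, 0.1], "glad": [0.8, 0.3], "table": [0.0, 1.0]}

kept = filter_label_words(happiness_vec, candidates)
scores = category_scores({"joyful": 0.2, "glad": 0.4}, {"happiness": kept})
```

Here the unrelated word "table" is filtered out as noise, and the two remaining label words both contribute to the single category "happiness".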
Learning high-quality dialogue representations is essential for solving a variety of dialogue-oriented tasks, especially considering that dialogue systems often suffer from data scarcity. In this paper, we introduce Dialogue Sentence Embedding (DSE), a self-supervised contrastive learning method that learns effective dialogue representations suitable for a wide range of dialogue tasks. DSE learns from dialogues by taking consecutive utterances of the same dialogue as positive pairs for contrastive learning. Despite its simplicity, DSE achieves significantly better representation capability than other dialogue representation and universal sentence representation models. We evaluate DSE on five downstream dialogue tasks that examine dialogue representation at different semantic granularities. Experiments in few-shot and zero-shot settings show that DSE outperforms baselines by a large margin; for example, it achieves a 13% average performance improvement over the strongest unsupervised baseline in 1-shot intent classification on 6 datasets. We also provide analyses of the benefits and limitations of our model.
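The positive-pair construction described in the DSE abstract can be sketched as follows. Only the use of consecutive utterances as positive pairs comes from the abstract; the InfoNCE-style loss with in-batch negatives, the temperature value, and the function names `consecutive_pairs` and `info_nce` are assumptions chosen as a common contrastive setup, not a description of the authors' training code.

```python
import math


def consecutive_pairs(dialogue):
    """Treat each pair of consecutive utterances as a positive pair."""
    return list(zip(dialogue, dialogue[1:]))


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def info_nce(anchors, positives, temperature=0.05):
    """Mean InfoNCE loss over a batch of embedding pairs.

    positives[i] is the positive for anchors[i]; every other entry of
    `positives` acts as an in-batch negative (an assumed design choice).
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        # Log-sum-exp with max subtraction for numerical stability.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_denominator - logits[i]
    return total / len(anchors)


dialogue = ["I lost my card.", "I can block it for you.", "Thanks, please do."]
pairs = consecutive_pairs(dialogue)

# Toy embeddings, invented for illustration; a real system would use an encoder.
anchors = [[1.0, 0.0], [0.0, 1.0]]
loss_aligned = info_nce(anchors, [[1.0, 0.0], [0.0, 1.0]])
loss_shuffled = info_nce(anchors, [[0.0, 1.0], [1.0, 0.0]])
```

In this toy batch the loss is near zero when each anchor is closest to its own positive and much larger when the positives are swapped, which is the signal that pulls consecutive utterances together in embedding space.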
In this paper, we explore slot tagging with only a few labeled support sentences (a.k.a. few-shot). Few-shot slot tagging faces a unique challenge compared to other few-shot classification problems, as it calls for modeling the dependencies between labels. However, it is hard to apply previously learned label dependencies to an unseen domain, due to the discrepancy between label sets. To tackle this, we introduce a collapsed dependency transfer mechanism into the conditional random field (CRF) to transfer abstract label dependency patterns as transition scores. In the few-shot setting, the emission score of the CRF can be calculated as a word’s similarity to the representation of each label. To calculate such similarity, we propose a Label-enhanced Task-Adaptive Projection Network (L-TapNet), based on the state-of-the-art few-shot classification model TapNet, which leverages label name semantics in representing labels. Experimental results show that our model significantly outperforms the strongest few-shot learning baseline by 14.64 F1 points in the one-shot setting.
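The two CRF ingredients named in the few-shot slot tagging abstract, similarity-based emission scores and label-to-label transition scores, can be illustrated with a small sketch. This is a simplified illustration under assumptions: it uses a plain dot product as the similarity, takes the transition scores as a given dictionary rather than deriving them through collapsed dependency transfer, and does not implement L-TapNet's projection. The names `emission_scores` and `viterbi` and all toy vectors are hypothetical.

```python
def dot(u, v):
    """Dot product, used here as an assumed similarity measure."""
    return sum(a * b for a, b in zip(u, v))


def emission_scores(word_vecs, label_vecs):
    """emission[t][label] = similarity of word t to that label's representation."""
    return [
        {label: dot(word, vec) for label, vec in label_vecs.items()}
        for word in word_vecs
    ]


def viterbi(emissions, transitions):
    """Return the label path maximizing emission plus transition scores.

    `transitions` maps (previous_label, current_label) to a score;
    missing pairs default to 0.0.
    """
    labels = list(emissions[0])
    best = {label: emissions[0][label] for label in labels}
    backpointers = []
    for scores in emissions[1:]:
        new_best, pointers = {}, {}
        for cur in labels:
            prev = max(
                labels, key=lambda p: best[p] + transitions.get((p, cur), 0.0)
            )
            new_best[cur] = best[prev] + transitions.get((prev, cur), 0.0) + scores[cur]
            pointers[cur] = prev
        best = new_best
        backpointers.append(pointers)
    # Trace back from the best final label.
    path = [max(labels, key=lambda label: best[label])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    return path[::-1]


# Toy label and word representations, invented for illustration.
label_vecs = {"O": [1.0, 0.0, 0.0], "B-city": [0.0, 1.0, 0.0], "I-city": [0.0, 0.0, 1.0]}
word_vecs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.5, 0.6]]  # "to", "new", "york"
transitions = {("O", "I-city"): -10.0}  # an I- tag should not directly follow O

path = viterbi(emission_scores(word_vecs, label_vecs), transitions)
```

Because the emission scores depend only on label representations built from the support set, the same decoding routine can be reused for an unseen label set, while the transition scores carry the label dependency information.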