Zixuan Ke


2022

Adapting a Language Model While Preserving its General Knowledge
Zixuan Ke | Yijia Shao | Haowei Lin | Hu Xu | Lei Shu | Bing Liu
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Domain-adaptive pre-training (or DA-training for short), also known as post-training, aims to train a pre-trained general-purpose language model (LM) using an unlabeled corpus of a particular domain to adapt the LM so that end-tasks in the domain can achieve improved performance. However, existing DA-training methods are in some sense blind as they do not explicitly identify what knowledge in the LM should be preserved and what should be changed by the domain corpus. This paper shows that the existing methods are suboptimal and proposes a novel method to perform a more informed adaptation of the knowledge in the LM by (1) soft-masking the attention heads based on their importance to best preserve the general knowledge in the LM and (2) contrasting the representations of the general and the full (both general and domain) knowledge to learn an integrated representation with both general and domain-specific knowledge. Experimental results demonstrate the effectiveness of the proposed approach.

Continual Training of Language Models for Few-Shot Learning
Zixuan Ke | Haowei Lin | Yijia Shao | Hu Xu | Lei Shu | Bing Liu
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Recent work on applying large language models (LMs) achieves impressive performance in many NLP applications. Adapting or post-training an LM using an unlabeled domain corpus can produce even better performance for end-tasks in the domain. This paper proposes the problem of continually extending an LM by incrementally post-training the LM with a sequence of unlabeled domain corpora to expand its knowledge without forgetting its previous skills. The goal is to improve few-shot end-task learning in these domains. The resulting system is called CPT (Continual Post-Training), which, to our knowledge, is the first continual post-training system. Experimental results verify its effectiveness.

Domain-Aware Contrastive Knowledge Transfer for Multi-domain Imbalanced Data
Zixuan Ke | Mohammad Kachuee | Sungjin Lee
Proceedings of the 12th Workshop on Computational Approaches to Subjectivity, Sentiment & Social Media Analysis

In many real-world machine learning applications, samples belong to a set of domains; e.g., for product reviews, each review belongs to a product category. In this paper, we study multi-domain imbalanced learning (MIL), the scenario in which there is imbalance not only in classes but also in domains. In the MIL setting, different domains exhibit different patterns, and there is a varying degree of similarity and divergence among domains, posing opportunities and challenges for transfer learning, especially when faced with limited or insufficient training data. We propose a novel domain-aware contrastive knowledge transfer method called DCMI to (1) identify the shared domain knowledge to encourage positive transfer among similar domains (in particular from head domains to tail domains) and (2) isolate the domain-specific knowledge to minimize the negative transfer from dissimilar domains. We evaluated the performance of DCMI on three different datasets, showing significant improvements in different MIL scenarios.

2021

CLASSIC: Continual and Contrastive Learning of Aspect Sentiment Classification Tasks
Zixuan Ke | Bing Liu | Hu Xu | Lei Shu
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

This paper studies continual learning (CL) of a sequence of aspect sentiment classification (ASC) tasks in a particular CL setting called domain incremental learning (DIL). Each task is from a different domain or product. The DIL setting is particularly suited to ASC because in testing the system need not know the task/domain to which the test data belongs. To our knowledge, this setting has not been studied before for ASC. This paper proposes a novel model called CLASSIC. The key novelty is a contrastive continual learning method that enables both knowledge transfer across tasks and knowledge distillation from old tasks to the new task, which eliminates the need for task ids in testing. Experimental results show the high effectiveness of CLASSIC.

Adapting BERT for Continual Learning of a Sequence of Aspect Sentiment Classification Tasks
Zixuan Ke | Hu Xu | Bing Liu
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

This paper studies continual learning (CL) of a sequence of aspect sentiment classification (ASC) tasks. Although some CL techniques have been proposed for document sentiment classification, we are not aware of any CL work on ASC. A CL system that incrementally learns a sequence of ASC tasks should address the following two issues: (1) transfer knowledge learned from previous tasks to the new task to help it learn a better model, and (2) maintain the performance of the models for previous tasks so that they are not forgotten. This paper proposes a novel capsule network based model called B-CL to address these issues. B-CL markedly improves the ASC performance on both the new task and the old tasks via forward and backward knowledge transfer. The effectiveness of B-CL is demonstrated through extensive experiments.

2019

Give Me More Feedback II: Annotating Thesis Strength and Related Attributes in Student Essays
Zixuan Ke | Hrishikesh Inamdar | Hui Lin | Vincent Ng
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

While the vast majority of existing work on automated essay scoring has focused on holistic scoring, researchers have recently begun work on scoring specific dimensions of essay quality. Nevertheless, progress on dimension-specific essay scoring is limited in part by the lack of annotated corpora. To facilitate advances in this area, we design a scoring rubric for scoring a core, yet unexplored dimension of persuasive essay quality, thesis strength, and annotate a corpus of essays with thesis strength scores. We additionally identify the attributes that could impact thesis strength and annotate the essays with the values of these attributes, which, when predicted by computational models, could provide further feedback to students on why their essays receive a particular thesis strength score.

2018

Give Me More Feedback: Annotating Argument Persuasiveness and Related Attributes in Student Essays
Winston Carlile | Nishant Gurrapadi | Zixuan Ke | Vincent Ng
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

While argument persuasiveness is one of the most important dimensions of argumentative essay quality, it is relatively little studied in automated essay scoring research. Progress on scoring argument persuasiveness is hindered in part by the scarcity of annotated corpora. We present the first corpus of essays that are simultaneously annotated with argument components, argument persuasiveness scores, and attributes of argument components that impact an argument’s persuasiveness. This corpus could trigger the development of novel computational models concerning argument persuasiveness that provide useful feedback to students on why their arguments are (un)persuasive in addition to how persuasive they are.