In this work, we study the language model backbone replacement problem for personalized downstream tasks in a non-stationary on-device scenario. In the real world, companies periodically update the knowledge and architectures of their backbones to stay competitive, while, to accommodate users' own preferences, models are personalized locally to fit each user's data distribution. Traditional full-model tuning or transfer learning for such replacements often incurs considerable on-device training cost and requires extensive backpropagation through deep transformer layers. To address this issue, we propose a novel, lightweight tuning method for personalized NLP classification tasks after backbone replacement. Our approach leverages a personalized matrix computed from documents encoded by a user's old and new backbones. This matrix supports top-layer parameter tuning, drastically reducing backpropagation computation. To further reduce the training cost of the matrix's linear optimization, we employ correlation clustering to curate a small number of examples from personalized cluster sets for each individual. Our method achieves more than a 1000x reduction in backpropagation FLOPs and provides a user-specific initialization for the personal matrix, yielding a significant performance boost over popular transfer learning methods.
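As a rough illustration of the idea of a personalized matrix, the sketch below fits a least-squares mapping between a user's documents encoded by the old and the new backbone and uses it to re-initialize the top-layer classifier on the new backbone. The mapping form, the embedding dimensions, and the closed-form solver are illustrative assumptions, not the exact formulation of the method.

```python
import numpy as np

def personalized_matrix(old_embs: np.ndarray, new_embs: np.ndarray) -> np.ndarray:
    """Least-squares map W such that new_embs @ W ~= old_embs.

    old_embs: (n_docs, d_old) embeddings of the user's documents, old backbone.
    new_embs: (n_docs, d_new) embeddings of the same documents, new backbone.
    Returns W of shape (d_new, d_old).
    """
    # Solve min_W ||new_embs @ W - old_embs||_F^2 in closed form.
    W, *_ = np.linalg.lstsq(new_embs, old_embs, rcond=None)
    return W

# Illustrative usage: transfer the top-layer classifier trained on the old
# backbone so it can serve as a user-specific initialization for the new one.
rng = np.random.default_rng(0)
old_embs = rng.normal(size=(32, 768))      # user's documents, old backbone
new_embs = rng.normal(size=(32, 1024))     # same documents, new backbone
old_head = rng.normal(size=(768, 5))       # old top-layer weights, 5 classes

W = personalized_matrix(old_embs, new_embs)
new_head_init = W @ old_head               # user-specific initialization
logits = new_embs @ new_head_init          # only this head needs further tuning
print(logits.shape)                        # (32, 5)
```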
In recent years, pre-trained language models have attracted widespread attention and have greatly advanced the application of natural language processing to various downstream tasks. Text summarization, an important branch of natural language processing, can effectively reduce redundant information and thereby speed up text browsing. Tibetan is a low-resource language lacking large-scale training corpora, and research on Tibetan abstractive text summarization is still in its infancy. To address this problem, this paper is the first to apply the end-to-end pre-trained language model CMPT (Chinese Minority Pre-Trained Language Model) to Tibetan abstractive summarization. CMPT is pre-trained on texts of other low-resource languages with denoising and contrastive learning; to further improve the encoder's understanding ability, a single-layer masked language model (MLM) decoder is added on top of the encoder's output layer, enabling joint Seq2Seq pre-training for both generation and understanding. With further fine-tuning, the model effectively improves performance on Tibetan text summarization. To validate the model, we conduct experiments on a self-constructed dataset of 50,000 Tibetan document-summary pairs and on the public Ti-SUM dataset. Experiments on both datasets show that our method achieves significant improvements on the evaluation metrics for Tibetan abstractive summarization. Moreover, the method is not limited to Tibetan text summarization and can also be extended to summarization in other languages, giving it good generalization value.
In this work, we investigate semantic parsing in a few-shot learning setting, where we are provided with k utterance-logical form pairs per new predicate. State-of-the-art neural semantic parsers achieve less than 25% accuracy on benchmark datasets when k = 1. To tackle this problem, we propose to i) apply a designated meta-learning method to train the model; ii) regularize attention scores with alignment statistics; and iii) apply a smoothing technique during pretraining. As a result, our method consistently outperforms all baselines in both one- and two-shot settings.
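A minimal sketch of one way attention scores could be regularized with alignment statistics: the parser's attention distribution is pulled toward an alignment prior via a KL penalty added to the parsing loss. The prior, the divergence choice, and the weighting are illustrative assumptions rather than the exact regularizer used here.

```python
import torch
import torch.nn.functional as F

def attention_alignment_loss(attn: torch.Tensor, align_prior: torch.Tensor) -> torch.Tensor:
    """KL(prior || attention), averaged over target positions.

    attn:        (tgt_len, src_len) attention weights from the parser's decoder.
    align_prior: (tgt_len, src_len) alignment statistics (e.g., from a word aligner),
                 normalized per target position.
    """
    log_attn = torch.log(attn + 1e-9)
    return F.kl_div(log_attn, align_prior, reduction="batchmean")

# Toy example: 3 logical-form tokens attending over 4 utterance tokens.
attn = torch.softmax(torch.randn(3, 4), dim=-1)
prior = torch.tensor([[0.7, 0.1, 0.1, 0.1],
                      [0.1, 0.7, 0.1, 0.1],
                      [0.1, 0.1, 0.1, 0.7]])
reg = attention_alignment_loss(attn, prior)

parser_loss = torch.tensor(1.0)   # placeholder for the usual seq2seq parsing loss
lambda_reg = 0.5                  # assumed regularization weight
total_loss = parser_loss + lambda_reg * reg
print(float(total_loss))
```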
Semantic parsing maps natural language (NL) utterances into logical forms (LFs) and underpins many advanced NLP problems. Semantic parsers gain performance boosts from deep neural networks, but also inherit their vulnerability to adversarial examples. In this paper, we provide the first empirical study of the robustness of semantic parsers in the presence of adversarial attacks. Formally, adversaries for semantic parsing are perturbed utterance-LF pairs whose utterances have exactly the same meaning as the original ones. We propose a scalable methodology to construct robustness test sets based on existing benchmark corpora. Our results answer five research questions about the performance of state-of-the-art parsers on the robustness test sets and the effect of data augmentation.
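A minimal sketch of how meaning-preserving adversaries could be generated for such a test set: utterances are perturbed with synonym substitutions while the paired LF is kept unchanged. The synonym table and the example pair are illustrative assumptions, not the actual construction methodology.

```python
from itertools import product

# Small hand-written synonym table used only for illustration.
SYNONYMS = {
    "show": ["list", "display"],
    "cheapest": ["least expensive"],
    "flights": ["trips"],
}

def perturb(utterance: str, lf: str, max_variants: int = 5):
    """Return (perturbed utterance, original LF) pairs built by swapping in synonyms."""
    tokens = utterance.split()
    options = [[tok] + SYNONYMS.get(tok, []) for tok in tokens]
    variants = []
    for combo in product(*options):
        candidate = " ".join(combo)
        if candidate != utterance:
            variants.append((candidate, lf))   # the LF is kept exactly the same
        if len(variants) >= max_variants:
            break
    return variants

pair = ("show the cheapest flights to denver",
        "lambda $0 ( and ( flight $0 ) ( to $0 denver:ci ) ( cheapest $0 ) )")
for utt, lf in perturb(*pair):
    print(utt, "->", lf)
```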