Yujun Chen


2022

ZeroPrompt: Scaling Prompt-Based Pretraining to 1,000 Tasks Improves Zero-Shot Generalization
Hanwei Xu | Yujun Chen | Yulun Du | Nan Shao | Wang Yanggang | Haiyu Li | Zhilin Yang
Findings of the Association for Computational Linguistics: EMNLP 2022

We propose a multitask pretraining approach ZeroPrompt for zero-shot generalization, focusing on task scaling and zero-shot prompting. While previous models are trained on only a few dozen tasks, we scale to 1,000 tasks for the first time using real-world data. This leads to a crucial discovery that task scaling can be an efficient alternative to model scaling; i.e., the model size has less impact on performance with an extremely large number of tasks. Our results show that task scaling can improve training efficiency by 30 times in FLOPs. Empirically, ZeroPrompt substantially improves both the efficiency and the performance of zero-shot learning across a variety of academic and production datasets.

Learning to Detect Noisy Labels Using Model-Based Features
Zhihao Wang | Zongyu Lin | Junjie Wen | Xianxin Chen | Peiqi Liu | Guidong Zheng | Yujun Chen | Zhilin Yang
Findings of the Association for Computational Linguistics: EMNLP 2022

Label noise is ubiquitous in various machine learning scenarios such as self-labeling with model predictions and erroneous data annotation. Many existing approaches are based on heuristics such as sample losses, which might not be flexible enough to achieve optimal solutions. Meta-learning-based methods address this issue by learning a data selection function, but can be hard to optimize. In light of these pros and cons, we propose SENT (Selection-Enhanced Noisy label Training), which does not rely on meta-learning while retaining the flexibility of being data-driven. SENT transfers the noise distribution to a clean set and trains a model to distinguish noisy labels from clean ones using model-based features. Empirically, on a wide range of tasks including text classification and speech recognition, SENT improves performance over strong baselines under the settings of self-training and label corruption.

GPS: Genetic Prompt Search for Efficient Few-Shot Learning
Hanwei Xu | Yujun Chen | Yulun Du | Nan Shao | Wang Yanggang | Haiyu Li | Zhilin Yang
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Prompt-based techniques have demonstrated great potential for improving the few-shot generalization of pretrained language models. However, their performance relies heavily on the manual design of prompts and thus requires considerable human effort. In this paper, we introduce Genetic Prompt Search (GPS) to improve few-shot learning with prompts, which utilizes a genetic algorithm to automatically search for the best prompt. GPS is gradient-free and requires no update of model parameters but only a small validation set. Experiments on diverse datasets demonstrate the effectiveness of GPS, which outperforms manual prompts by a large margin of 2.6 points. Our method is also better than other parameter-efficient tuning methods such as prompt tuning.
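
As a rough illustration of the kind of gradient-free genetic search the GPS abstract describes, the sketch below keeps the best-scoring prompts, mutates them, and repeats. It is not the authors' implementation: the function names, the word-swap mutation, and the toy scoring function (a stand-in for accuracy on a small validation set) are all assumptions made for this example.

```python
import random

def genetic_prompt_search(seed_prompts, score_fn, mutate_fn,
                          generations=5, top_k=3, children_per_parent=2,
                          rng=None):
    """Generic genetic search over prompt strings (hypothetical helper).

    score_fn stands in for evaluating a frozen model on a small validation
    set; no model parameters are updated anywhere in this loop.
    """
    if rng is None:
        rng = random.Random(0)  # fixed seed so the sketch is reproducible
    population = list(dict.fromkeys(seed_prompts))  # dedupe, keep order
    for _ in range(generations):
        # Selection: keep the top_k prompts by validation score.
        survivors = sorted(population, key=score_fn, reverse=True)[:top_k]
        # Variation: generate new candidates from each survivor.
        children = [mutate_fn(p, rng)
                    for p in survivors
                    for _ in range(children_per_parent)]
        # Survivors stay in the pool, so the best score never decreases.
        population = list(dict.fromkeys(survivors + children))
    return max(population, key=score_fn)

# Toy stand-ins (assumptions for illustration only).
VOCAB = ["review", "sentiment", "positive", "negative", "text", "is", "the", "this"]
TARGET = {"sentiment", "review", "positive"}

def toy_score(prompt):
    # Pretend validation score: number of target words the prompt contains.
    return len(TARGET & set(prompt.split()))

def toy_mutate(prompt, rng):
    # Replace one randomly chosen word with a random vocabulary word.
    words = prompt.split()
    words[rng.randrange(len(words))] = rng.choice(VOCAB)
    return " ".join(words)

seeds = ["this text is negative", "the review is the text"]
best = genetic_prompt_search(seeds, toy_score, toy_mutate)
print(best, toy_score(best))
```

In a real setting, the scoring function would run the frozen language model on held-out examples, and the mutation step could use any prompt-rewriting strategy; the word swap here is only the simplest choice that keeps the example self-contained.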