Weihao Zeng


2024

pdf
Automatic Instruction Evolving for Large Language Models
Weihao Zeng | Can Xu | Yingxiu Zhao | Jian-Guang Lou | Weizhu Chen
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

Fine-tuning large pre-trained language models with Evol-Instruct has achieved encouraging results across a wide range of tasks. However, designing effective evolving methods for instruction evolution requires substantial human expertise. This paper proposes Auto Evol-Instruct, an end-to-end framework that evolves instruction datasets using large language models without any human effort. The framework automatically analyzes and summarizes suitable evolutionary strategies for the given instruction data and iteratively improves the evolving method based on issues exposed during the instruction evolution process. Our extensive experiments demonstrate that the best method optimized by Auto Evol-Instruct outperforms human-designed methods on various benchmarks, including MT-Bench, AlpacaEval, GSM8K, and HumanEval.

pdf
DivTOD: Unleashing the Power of LLMs for Diversifying Task-Oriented Dialogue Representations
Weihao Zeng | Dayuan Fu | Keqing He | Yejie Wang | Yukai Xu | Weiran Xu
Findings of the Association for Computational Linguistics: NAACL 2024

Language models pre-trained on general text have achieved impressive results in diverse fields. Yet, the distinct linguistic characteristics of task-oriented dialogues (TOD) compared to general text limit the practical utility of existing language models. Current task-oriented dialogue pre-training methods overlook the one-to-many property of conversations, where multiple responses can be appropriate given the same conversation context.In this paper, we propose a novel dialogue pre-training model called DivTOD, which collaborates with LLMs to learn diverse task-oriented dialogue representations. DivTOD guides LLMs in transferring diverse knowledge to smaller models while removing domain knowledge that contradicts task-oriented dialogues. Experiments show that our model outperforms strong TOD baselines on various downstream dialogue tasks and learns the intrinsic diversity of task-oriented dialogues.

pdf
DolphCoder: Echo-Locating Code Large Language Models with Diverse and Multi-Objective Instruction Tuning
Yejie Wang | Keqing He | Guanting Dong | Pei Wang | Weihao Zeng | Muxi Diao | Weiran Xu | Jingang Wang | Mengdi Zhang | Xunliang Cai
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Code Large Language Models (Code LLMs) have demonstrated outstanding performance in code-related tasks. Various instruction finetuning approaches have been proposed to boost the code generation performance of pre-trained Code LLMs. In this paper, we introduce a diverse instruction model DolphCoder with self-evaluating for code generation. It learns diverse instruction targets and combines a code evaluation objective to enhance its code generation ability. Our model achieves superior performance on the HumanEval and MBPP benchmarks, demonstrating new insights for future code instruction tuning work. Our key findings are: (1) Augmenting more diverse responses with more distinct reasoning paths increases the code capability of LLMs. (2) Improving one’s ability to evaluate the correctness of code also enhances their ability to create it.

pdf
BootTOD: Bootstrap Task-oriented Dialogue Representations by Aligning Diverse Responses
Weihao Zeng | Keqing He | Yejie Wang | Dayuan Fu | Weiran Xu
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Pre-trained language models have been successful in many scenarios. However, their usefulness in task-oriented dialogues is limited due to the intrinsic linguistic differences between general text and task-oriented dialogues. Current task-oriented dialogue pre-training methods rely on a contrastive framework, which faces challenges such as selecting true positives and hard negatives, as well as lacking diversity. In this paper, we propose a novel dialogue pre-training model called BootTOD. It learns task-oriented dialogue representations via a self-bootstrapping framework. Unlike contrastive counterparts, BootTOD aligns context and context+response representations and dismisses the requirements of contrastive pairs. BootTOD also uses multiple appropriate response targets to model the intrinsic one-to-many diversity of human conversations. Experimental results show that BootTOD outperforms strong TOD baselines on diverse downstream dialogue tasks.

2023

pdf
FutureTOD: Teaching Future Knowledge to Pre-trained Language Model for Task-Oriented Dialogue
Weihao Zeng | Keqing He | Yejie Wang | Chen Zeng | Jingang Wang | Yunsen Xian | Weiran Xu
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Pre-trained language models based on general text enable huge success in the NLP scenario. But the intrinsical difference of linguistic patterns between general text and task-oriented dialogues makes existing pre-trained language models less useful in practice. Current dialogue pre-training methods rely on a contrastive framework and face the challenges of both selecting true positives and hard negatives. In this paper, we propose a novel dialogue pre-training model, FutureTOD, which distills future knowledge to the representation of the previous dialogue context using a self-training framework. Our intuition is that a good dialogue representation both learns local context information and predicts future information. Extensive experiments on diverse downstream dialogue tasks demonstrate the effectiveness of our model, especially the generalization, robustness, and learning discriminative dialogue representations capabilities.

pdf
Seen to Unseen: Exploring Compositional Generalization of Multi-Attribute Controllable Dialogue Generation
Weihao Zeng | Lulu Zhao | Keqing He | Ruotong Geng | Jingang Wang | Wei Wu | Weiran Xu
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Existing controllable dialogue generation work focuses on the single-attribute control and lacks generalization capability to out-of-distribution multiple attribute combinations. In this paper, we explore the compositional generalization for multi-attribute controllable dialogue generation where a model can learn from seen attribute values and generalize to unseen combinations. We propose a prompt-based disentangled controllable dialogue generation model, DCG. It learns attribute concept composition by generating attribute-oriented prompt vectors and uses a disentanglement loss to disentangle different attributes for better generalization. Besides, we design a unified reference-free evaluation framework for multiple attributes with different levels of granularities. Experiment results on two benchmarks prove the effectiveness of our method and the evaluation metric.

2022

pdf
Domain-Oriented Prefix-Tuning: Towards Efficient and Generalizable Fine-tuning for Zero-Shot Dialogue Summarization
Lulu Zhao | Fujia Zheng | Weihao Zeng | Keqing He | Weiran Xu | Huixing Jiang | Wei Wu | Yanan Wu
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

The most advanced abstractive dialogue summarizers lack generalization ability on new domains and the existing researches for domain adaptation in summarization generally rely on large-scale pre-trainings. To explore the lightweight fine-tuning methods for domain adaptation of dialogue summarization, in this paper, we propose an efficient and generalizable Domain-Oriented Prefix-tuning model, which utilizes a domain word initialized prefix module to alleviate domain entanglement and adopts discrete prompts to guide the model to focus on key contents of dialogues and enhance model generalization. We conduct zero-shot experiments and build domain adaptation benchmarks on two multi-domain dialogue summarization datasets, TODSum and QMSum. Adequate experiments and qualitative analysis prove the effectiveness of our methods.

pdf
Semi-Supervised Knowledge-Grounded Pre-training for Task-Oriented Dialog Systems
Weihao Zeng | Keqing He | Zechen Wang | Dayuan Fu | Guanting Dong | Ruotong Geng | Pei Wang | Jingang Wang | Chaobo Sun | Wei Wu | Weiran Xu
Proceedings of the Towards Semi-Supervised and Reinforced Task-Oriented Dialog Systems (SereTOD)

Recent advances in neural approaches greatly improve task-oriented dialogue (TOD) systems which assist users to accomplish their goals. However, such systems rely on costly manually labeled dialogs which are not available in practical scenarios. In this paper, we present our models for Track 2 of the SereTOD 2022 challenge, which is the first challenge of building semisupervised and reinforced TOD systems on a large-scale real-world Chinese TOD dataset MobileCS. We build a knowledge-grounded dialog model to formulate dialog history and local KB as input and predict the system response. And we perform semi-supervised pretraining both on the labeled and unlabeled data. Our system achieves the first place both in the automatic evaluation and human interaction, especially with higher BLEU (+7.64) and Success (+13.6%) than the second place.

2021

pdf
Give the Truth: Incorporate Semantic Slot into Abstractive Dialogue Summarization
Lulu Zhao | Weihao Zeng | Weiran Xu | Jun Guo
Findings of the Association for Computational Linguistics: EMNLP 2021

Abstractive dialogue summarization suffers from a lots of factual errors, which are due to scattered salient elements in the multi-speaker information interaction process. In this work, we design a heterogeneous semantic slot graph with a slot-level mask cross-attention to enhance the slot features for more correct summarization. We also propose a slot-driven beam search algorithm in the decoding process to give priority to generating salient elements in a limited length by “filling-in-the-blanks”. Besides, an adversarial contrastive learning assisting the training process is introduced to alleviate the exposure bias. Experimental performance on different types of factual errors shows the effectiveness of our methods and human evaluation further verifies the results..