Xinxian Huang
2022
PLATO-XL: Exploring the Large-scale Pre-training of Dialogue Generation
Siqi Bao | Huang He | Fan Wang | Hua Wu | Haifeng Wang | Wenquan Wu | Zhihua Wu | Zhen Guo | Hua Lu | Xinxian Huang | Xin Tian | Xinchao Xu | Yingzhan Lin | Zheng-Yu Niu
Findings of the Association for Computational Linguistics: AACL-IJCNLP 2022
To explore the limits of dialogue generation pre-training, we present PLATO-XL, with models of up to 11 billion parameters trained on both Chinese and English social media conversations. To train such large models, we adopt the unified transformer architecture for its high computation and parameter efficiency. In addition, we carry out multi-party aware pre-training to better distinguish the characteristic information in social media conversations. With these designs, PLATO-XL achieves superior performance compared to other approaches in both Chinese and English chitchat. We further explore the capacity of PLATO-XL on other conversational tasks, such as knowledge-grounded dialogue and task-oriented conversation. The experimental results indicate that PLATO-XL obtains state-of-the-art results across multiple conversational tasks, verifying its potential as a foundation model for conversational AI.
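One common way to make a dialogue model "multi-party aware", as the abstract describes, is to add a speaker-role embedding to each token's input representation so the model can distinguish utterances from different participants. The sketch below illustrates that idea only; the table sizes, dimensions, and random toy values are assumptions for illustration, not PLATO-XL's actual configuration.

```python
# Hedged sketch: per-token input = token embedding + position embedding
# + speaker-role embedding, so different participants in a social-media
# thread are distinguishable. All tables here are random toy values.
import random

random.seed(0)
DIM = 4  # toy embedding dimension

def make_table(n):
    """A small random embedding table with n rows of DIM values."""
    return [[random.uniform(-0.1, 0.1) for _ in range(DIM)] for _ in range(n)]

tok_table = make_table(100)   # vocabulary embeddings (toy size)
pos_table = make_table(32)    # position embeddings
role_table = make_table(4)    # one row per dialogue participant

def embed(token_ids, role_ids):
    """Return per-token input vectors summing token, position, and role embeddings."""
    out = []
    for pos, (t, r) in enumerate(zip(token_ids, role_ids)):
        vec = [tok_table[t][d] + pos_table[pos][d] + role_table[r][d]
               for d in range(DIM)]
        out.append(vec)
    return out
```

In a real unified transformer, these summed vectors would be fed to shared encoder-decoder layers; here they only demonstrate the role-tagging idea.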
2021
PLATO-KAG: Unsupervised Knowledge-Grounded Conversation via Joint Modeling
Xinxian Huang | Huang He | Siqi Bao | Fan Wang | Hua Wu | Haifeng Wang
Proceedings of the 3rd Workshop on Natural Language Processing for Conversational AI
Large-scale conversation models are turning to external knowledge to improve factual accuracy in response generation. Given the infeasibility of annotating external knowledge for large-scale dialogue corpora, it is desirable to learn knowledge selection and response generation in an unsupervised manner. In this paper, we propose PLATO-KAG (Knowledge-Augmented Generation), an unsupervised learning approach for end-to-end knowledge-grounded conversation modeling. For each dialogue context, the top-k relevant knowledge elements are selected and then employed in knowledge-grounded response generation. The two components, knowledge selection and response generation, are optimized jointly and effectively under a balanced objective. Experimental results on two publicly available datasets validate the superiority of PLATO-KAG.
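The mechanism the abstract describes, selecting the top-k knowledge elements for a context and marginalizing the response likelihood over them so selection and generation can be trained jointly, can be sketched as follows. The dot-product scorer and the probabilities passed in are toy placeholders, not the paper's actual models or objective.

```python
# Hedged sketch of top-k knowledge selection plus marginalized,
# knowledge-grounded response likelihood, the quantity a joint
# unsupervised objective can optimize. Scoring is a toy dot product.
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def top_k_selection(context_vec, knowledge_vecs, k):
    """Score each knowledge element against the context; keep the top-k.

    Returns (indices of selected elements, p(z | c) over those elements).
    """
    scores = [sum(c * z for c, z in zip(context_vec, kv)) for kv in knowledge_vecs]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    probs = softmax([scores[i] for i in ranked])
    return ranked, probs

def marginal_response_likelihood(sel_probs, gen_probs):
    """p(r | c) ≈ Σ_k p(z_k | c) · p(r | c, z_k).

    sel_probs: selection probabilities over the top-k elements.
    gen_probs: generation likelihood of the response given each element.
    """
    return sum(p_z * p_r for p_z, p_r in zip(sel_probs, gen_probs))
```

Because the marginal likelihood sums over the selected knowledge, gradients flow into both the selector and the generator, which is what lets the two components be learned jointly without knowledge annotations.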