Abstract
Pre-trained models have proved effective for a wide range of natural language processing tasks. Inspired by this, we propose a novel dialogue generation pre-training framework to support various kinds of conversations, including chit-chat, knowledge-grounded dialogues, and conversational question answering. In this framework, we adopt flexible attention mechanisms to fully leverage the bi-directional context and the uni-directional characteristic of language generation. We also introduce discrete latent variables to tackle the inherent one-to-many mapping problem in response generation. Two reciprocal tasks, response generation and latent act recognition, are designed and carried out simultaneously within a shared network. Comprehensive experiments on three publicly available datasets verify the effectiveness and superiority of the proposed framework.
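The flexible attention mentioned in the abstract can be pictured as a single self-attention mask over the concatenated input: context tokens (and the prepended latent token) attend to each other bi-directionally, while response tokens attend to the full context plus only the preceding response tokens. The sketch below is a minimal, framework-agnostic illustration of such a mask, not the authors' implementation; the function name and the assumption that the discrete latent variable is realized as one special token prepended to the context are ours.

```python
import numpy as np

def flexible_attention_mask(context_len, response_len):
    """Illustrative mask: M[i, j] = True means token i may attend to token j.

    - Context tokens (including a hypothetical prepended latent token [z])
      attend bi-directionally within the context block.
    - Response tokens attend to the whole context and, causally, to
      earlier response tokens, matching uni-directional generation.
    """
    total = context_len + response_len
    mask = np.zeros((total, total), dtype=bool)

    # Bi-directional block: every context token sees every context token.
    mask[:context_len, :context_len] = True

    # Response tokens see the whole context ...
    mask[context_len:, :context_len] = True

    # ... and only preceding (or same-position) response tokens.
    mask[context_len:, context_len:] = np.tril(
        np.ones((response_len, response_len), dtype=bool)
    )
    return mask

# Example: 4 context tokens (e.g. [z] plus 3 history tokens), 3 response tokens.
print(flexible_attention_mask(context_len=4, response_len=3).astype(int))
```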
- Anthology ID: 2020.acl-main.9
- Volume: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
- Month: July
- Year: 2020
- Address: Online
- Editors: Dan Jurafsky, Joyce Chai, Natalie Schluter, Joel Tetreault
- Venue: ACL
- Publisher: Association for Computational Linguistics
- Pages: 85–96
- URL: https://aclanthology.org/2020.acl-main.9
- DOI: 10.18653/v1/2020.acl-main.9
- Cite (ACL): Siqi Bao, Huang He, Fan Wang, Hua Wu, and Haifeng Wang. 2020. PLATO: Pre-trained Dialogue Generation Model with Discrete Latent Variable. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 85–96, Online. Association for Computational Linguistics.
- Cite (Informal): PLATO: Pre-trained Dialogue Generation Model with Discrete Latent Variable (Bao et al., ACL 2020)
- PDF: https://preview.aclanthology.org/ingest-2024-clasp/2020.acl-main.9.pdf
- Code: PaddlePaddle/Research + additional community code
- Data: DailyDialog