More is Better: Enhancing Open-Domain Dialogue Generation via Multi-Source Heterogeneous Knowledge
Sixing Wu, Ying Li, Minghui Wang, Dawei Zhang, Yang Zhou, Zhonghai Wu
Abstract
Despite achieving remarkable performance, previous knowledge-enhanced works usually use only a single-source homogeneous knowledge base with limited knowledge coverage. They therefore often degenerate into traditional methods, because not all dialogues can be linked to knowledge entries. This paper proposes a novel dialogue generation model, MSKE-Dialog, which addresses this issue with three advantages: (1) rather than relying on only one source, MSKE-Dialog can simultaneously leverage multiple heterogeneous knowledge sources (including but not limited to commonsense knowledge facts, text knowledge, and infobox knowledge) to improve knowledge coverage; (2) to avoid topic conflicts among the context and the different knowledge sources, we propose Multi-Reference Selection to better select context and knowledge; (3) we propose Multi-Reference Generation to generate informative responses by referring to multiple generation references at the same time. Extensive evaluations on a Chinese dataset show the superior performance of this work against various state-of-the-art approaches. To the best of our knowledge, this work is the first to use multi-source heterogeneous knowledge in open-domain knowledge-enhanced dialogue generation.
- Anthology ID: 2021.emnlp-main.175
- Volume: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
- Month: November
- Year: 2021
- Address: Online and Punta Cana, Dominican Republic
- Venue: EMNLP
- Publisher: Association for Computational Linguistics
- Pages: 2286–2300
- URL: https://aclanthology.org/2021.emnlp-main.175
- DOI: 10.18653/v1/2021.emnlp-main.175
- Cite (ACL): Sixing Wu, Ying Li, Minghui Wang, Dawei Zhang, Yang Zhou, and Zhonghai Wu. 2021. More is Better: Enhancing Open-Domain Dialogue Generation via Multi-Source Heterogeneous Knowledge. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 2286–2300, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
- Cite (Informal): More is Better: Enhancing Open-Domain Dialogue Generation via Multi-Source Heterogeneous Knowledge (Wu et al., EMNLP 2021)
- PDF: https://preview.aclanthology.org/ingestion-script-update/2021.emnlp-main.175.pdf
- Code: pku-sixing/emnlp2021-mske_dialog
- Data: ConceptNet