Abstract
Neural text generation models have achieved remarkable success in carrying on short open-domain conversations. However, their performance degrades significantly in the long term, especially in their ability to ask coherent questions. A significant issue is the generation of redundant questions where the answer has already been provided by the user. We adapt and evaluate different methods, including negative training, decoding, and classification, to mitigate the redundancy problem. We also propose a simple yet effective method for generating training data without the need for crowdsourcing human-human or human-bot conversations. Experiments with the BlenderBot model show that our combined method significantly reduces the rate of redundant questions from 27.2% to 8.7%, while improving the quality of the original model. The code, dataset, and trained models can be found at our repository.
- Anthology ID:
- 2023.acl-srw.33
- Volume:
- Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop)
- Month:
- July
- Year:
- 2023
- Address:
- Toronto, Canada
- Editors:
- Vishakh Padmakumar, Gisela Vallejo, Yao Fu
- Venue:
- ACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 226–236
- URL:
- https://aclanthology.org/2023.acl-srw.33
- DOI:
- 10.18653/v1/2023.acl-srw.33
- Cite (ACL):
- Long Mai and Julie Carson-Berndsen. 2023. I already said that! Degenerating redundant questions in open-domain dialogue systems. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop), pages 226–236, Toronto, Canada. Association for Computational Linguistics.
- Cite (Informal):
- I already said that! Degenerating redundant questions in open-domain dialogue systems. (Mai & Carson-Berndsen, ACL 2023)
- PDF:
- https://preview.aclanthology.org/fix-volume-bibkeys/2023.acl-srw.33.pdf