Contrastive Response Pairs for Automatic Evaluation of Non-task-oriented Neural Conversational Models
Koshiro Okano, Yu Suzuki, Masaya Kawamura, Tsuneo Kato, Akihiro Tamura, Jianming Wu
Abstract
Responses generated by neural conversational models (NCMs) for non-task-oriented systems are difficult to evaluate. We propose contrastive response pairs (CRPs) for automatically evaluating responses from non-task-oriented NCMs. We conducted an error analysis on responses generated by an encoder-decoder recurrent neural network (RNN) type NCM and created three types of CRPs corresponding to the three most frequent errors found in the analysis. Three NCMs of different response quality were objectively evaluated with the CRPs and compared to a subjective assessment. The correctness obtained with the three types of CRPs was consistent with the results of the subjective assessment.
- Anthology ID:
- 2021.sigdial-1.21
- Volume:
- Proceedings of the 22nd Annual Meeting of the Special Interest Group on Discourse and Dialogue
- Month:
- July
- Year:
- 2021
- Address:
- Singapore and Online
- Editors:
- Haizhou Li, Gina-Anne Levow, Zhou Yu, Chitralekha Gupta, Berrak Sisman, Siqi Cai, David Vandyke, Nina Dethlefs, Yan Wu, Junyi Jessy Li
- Venue:
- SIGDIAL
- SIG:
- SIGDIAL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 202–207
- URL:
- https://aclanthology.org/2021.sigdial-1.21
- DOI:
- 10.18653/v1/2021.sigdial-1.21
- Cite (ACL):
- Koshiro Okano, Yu Suzuki, Masaya Kawamura, Tsuneo Kato, Akihiro Tamura, and Jianming Wu. 2021. Contrastive Response Pairs for Automatic Evaluation of Non-task-oriented Neural Conversational Models. In Proceedings of the 22nd Annual Meeting of the Special Interest Group on Discourse and Dialogue, pages 202–207, Singapore and Online. Association for Computational Linguistics.
- Cite (Informal):
- Contrastive Response Pairs for Automatic Evaluation of Non-task-oriented Neural Conversational Models (Okano et al., SIGDIAL 2021)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-4/2021.sigdial-1.21.pdf