Open-Domain Dialog Evaluation Using Follow-Ups Likelihood

Maxime De Bruyn, Ehsan Lotfi, Jeska Buhmann, Walter Daelemans


Abstract
Automatic evaluation of open-domain dialogs remains an unsolved problem. Existing methods do not correlate strongly with human annotations. In this paper, we present a new automated evaluation method based on follow-ups: we measure the probability that a language model will continue the conversation with a fixed set of follow-ups (e.g., "not really relevant here", "what are you trying to say?"). Compared against twelve existing methods, our new evaluation achieves the highest correlation with human evaluations.
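The idea lends itself to a compact implementation. Below is a minimal sketch of follow-ups likelihood scoring, assuming an off-the-shelf GPT-2 model from HuggingFace transformers; the follow-up set, the prompt layout, and the averaging scheme are illustrative assumptions, not the authors' exact setup (their code is in the maximedb/full repository linked below).

import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

# Hypothetical negative follow-ups: the more likely the language model is to
# continue the conversation with these, the worse the evaluated response.
NEGATIVE_FOLLOW_UPS = [
    "not really relevant here",
    "what are you trying to say?",
]

def follow_up_log_likelihood(context: str, follow_up: str) -> float:
    """Average log-probability of the follow-up's tokens given the context."""
    context_ids = tokenizer.encode(context)
    follow_ids = tokenizer.encode(" " + follow_up)
    input_ids = torch.tensor([context_ids + follow_ids])
    with torch.no_grad():
        logits = model(input_ids).logits
    log_probs = torch.log_softmax(logits, dim=-1)
    total = 0.0
    for i, tok in enumerate(follow_ids):
        # The logit at position p predicts the token at position p + 1,
        # so follow-up token i is scored by the logit one step earlier.
        total += log_probs[0, len(context_ids) + i - 1, tok].item()
    return total / len(follow_ids)

def score_response(dialog_context: str, response: str) -> float:
    """Higher score = negative follow-ups are less likely = better response."""
    text = dialog_context + "\n" + response + "\n"
    lls = [follow_up_log_likelihood(text, f) for f in NEGATIVE_FOLLOW_UPS]
    return -sum(lls) / len(lls)

print(score_response("A: How was your weekend?", "B: Great, I went hiking!"))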
Anthology ID:
2022.coling-1.40
Volume:
Proceedings of the 29th International Conference on Computational Linguistics
Month:
October
Year:
2022
Address:
Gyeongju, Republic of Korea
Editors:
Nicoletta Calzolari, Chu-Ren Huang, Hansaem Kim, James Pustejovsky, Leo Wanner, Key-Sun Choi, Pum-Mo Ryu, Hsin-Hsi Chen, Lucia Donatelli, Heng Ji, Sadao Kurohashi, Patrizia Paggio, Nianwen Xue, Seokhwan Kim, Younggyun Hahm, Zhong He, Tony Kyungil Lee, Enrico Santus, Francis Bond, Seung-Hoon Na
Venue:
COLING
Publisher:
International Committee on Computational Linguistics
Pages:
496–504
URL:
https://aclanthology.org/2022.coling-1.40
Cite (ACL):
Maxime De Bruyn, Ehsan Lotfi, Jeska Buhmann, and Walter Daelemans. 2022. Open-Domain Dialog Evaluation Using Follow-Ups Likelihood. In Proceedings of the 29th International Conference on Computational Linguistics, pages 496–504, Gyeongju, Republic of Korea. International Committee on Computational Linguistics.
Cite (Informal):
Open-Domain Dialog Evaluation Using Follow-Ups Likelihood (De Bruyn et al., COLING 2022)
PDF:
https://preview.aclanthology.org/ingest-2024-clasp/2022.coling-1.40.pdf
Code:
maximedb/full
Data:
FED