Treating Dialogue Quality Evaluation as an Anomaly Detection Problem

Rostislav Nedelchev, Ricardo Usbeck, Jens Lehmann


Abstract
Dialogue systems for interaction with humans have become increasingly popular in both research and industry. To this day, the best way to estimate their success is human evaluation rather than automated approaches, despite the abundance of work in the field. In this paper, we investigate the effectiveness of treating dialogue evaluation as an anomaly detection task. We examine four dialogue modeling approaches and how their objective functions correlate with human annotation scores. At a high level, the results are negative. However, a more in-depth look shows some potential for using anomaly detection to evaluate dialogues.
Anthology ID:
2020.lrec-1.64
Volume:
Proceedings of the Twelfth Language Resources and Evaluation Conference
Month:
May
Year:
2020
Address:
Marseille, France
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
508–512
Language:
English
URL:
https://aclanthology.org/2020.lrec-1.64
Cite (ACL):
Rostislav Nedelchev, Ricardo Usbeck, and Jens Lehmann. 2020. Treating Dialogue Quality Evaluation as an Anomaly Detection Problem. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 508–512, Marseille, France. European Language Resources Association.
Cite (Informal):
Treating Dialogue Quality Evaluation as an Anomaly Detection Problem (Nedelchev et al., LREC 2020)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2020.lrec-1.64.pdf