How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation - ACL Anthology

Chia-Wei Liu, Ryan Lowe, Iulian Serban, Mike Noseworthy, Laurent Charlin, Joelle Pineau

Anthology ID: D16-1230
Volume: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing
Month: November
Year: 2016
Address: Austin, Texas
Editors: Jian Su, Kevin Duh, Xavier Carreras
Venue: EMNLP
SIG: SIGDAT
Publisher: Association for Computational Linguistics
Pages: 2122–2132
URL: https://aclanthology.org/D16-1230
DOI: 10.18653/v1/D16-1230
Cite (ACL): Chia-Wei Liu, Ryan Lowe, Iulian Serban, Mike Noseworthy, Laurent Charlin, and Joelle Pineau. 2016. How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pages 2122–2132, Austin, Texas. Association for Computational Linguistics.
Cite (Informal): How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation (Liu et al., EMNLP 2016)
PDF: https://preview.aclanthology.org/dois-2013-emnlp/D16-1230.pdf
Attachment: D16-1230.Attachment.zip
Video: https://preview.aclanthology.org/dois-2013-emnlp/D16-1230.mp4