@inproceedings{svikhnushina-etal-2022-ieval,
    title = "i{E}val: Interactive Evaluation Framework for Open-Domain Empathetic Chatbots",
    author = "Svikhnushina, Ekaterina  and
      Filippova, Anastasiia  and
      Pu, Pearl",
    editor = "Lemon, Oliver  and
      Hakkani-Tur, Dilek  and
      Li, Junyi Jessy  and
      Ashrafzadeh, Arash  and
      Garcia, Daniel Hern{\'a}ndez  and
      Alikhani, Malihe  and
      Vandyke, David  and
      Du{\v{s}}ek, Ond{\v{r}}ej",
    booktitle = "Proceedings of the 23rd Annual Meeting of the Special Interest Group on Discourse and Dialogue",
    month = sep,
    year = "2022",
    address = "Edinburgh, UK",
    publisher = "Association for Computational Linguistics",
    url = "https://preview.aclanthology.org/ingest-emnlp/2022.sigdial-1.41/",
    doi = "10.18653/v1/2022.sigdial-1.41",
    pages = "419--431",
    abstract = "Building an empathetic chatbot is an important objective in dialog generation research, with evaluation being one of the most challenging parts. By empathy, we mean the ability to understand and relate to the speakers' emotions, and respond to them appropriately. Human evaluation has been considered as the current standard for measuring the performance of open-domain empathetic chatbots. However, existing evaluation procedures suffer from a number of limitations we try to address in our current work. In this paper, we describe iEval, a novel interactive evaluation framework where the person chatting with the bots also rates them on different conversational aspects, as well as ranking them, resulting in greater consistency of the scores. We use iEval to benchmark several state-of-the-art empathetic chatbots, allowing us to discover some intricate details in their performance in different emotional contexts. Based on these results, we present key implications for further improvement of such chatbots. To facilitate other researchers using the iEval framework, we will release our dataset consisting of collected chat logs and human scores."
}