Abstract
Recently, empathetic dialogue systems have received significant attention. While some researchers have noted limitations, e.g., that these systems tend to generate generic utterances, no study has systematically verified these issues. We survey 21 systems, asking what progress has been made on the task. We observe multiple limitations of current evaluation procedures. Most critically, studies tend to rely on a single non-reproducible empathy score, which inadequately reflects the multidimensional nature of empathy. To better understand the differences between systems, we comprehensively analyze each system with automated methods that are grounded in a variety of aspects of empathy. We find that recent systems lack three important aspects of empathy: specificity, reflection levels, and diversity. Based on our results, we discuss problematic behaviors that may have gone undetected in prior evaluations, and offer guidance for developing future systems.

- Anthology ID:
- 2024.eacl-long.11
- Volume:
- Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
- Month:
- March
- Year:
- 2024
- Address:
- St. Julian’s, Malta
- Editors:
- Yvette Graham, Matthew Purver
- Venue:
- EACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 179–189
- URL:
- https://aclanthology.org/2024.eacl-long.11
- Cite (ACL):
- Andrew Lee, Jonathan Kummerfeld, Larry Ann, and Rada Mihalcea. 2024. A Comparative Multidimensional Analysis of Empathetic Systems. In Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers), pages 179–189, St. Julian’s, Malta. Association for Computational Linguistics.
- Cite (Informal):
- A Comparative Multidimensional Analysis of Empathetic Systems (Lee et al., EACL 2024)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-3/2024.eacl-long.11.pdf