Abstract
Current evaluation practices for social dialog systems, dedicated to human-computer spontaneous conversation, exclusively focus on the quality of system-generated surface text, but not human-verifiable aspects of mutual understanding between the systems and their interlocutors. This work proposes Word Sense Disambiguation (WSD) as an essential component of a valid and reliable human evaluation framework, whose long-term goal is to radically improve the usability of dialog systems in real-life human-computer collaboration. The practicality of this proposal is proved via experimentally investigating (1) the WordNet 3.0 sense inventory coverage of lexical meanings in spontaneous conversation between humans in American English, assumed as an upper bound of lexical diversity of human-computer communication, and (2) the effectiveness of state-of-the-art WSD models and pretrained transformer-based contextual embeddings on this type of data.
- Anthology ID:
- 2022.humeval-1.10
- Volume:
- Proceedings of the 2nd Workshop on Human Evaluation of NLP Systems (HumEval)
- Month:
- May
- Year:
- 2022
- Address:
- Dublin, Ireland
- Venue:
- HumEval
- Publisher:
- Association for Computational Linguistics
- Pages:
- 116–125
- URL:
- https://aclanthology.org/2022.humeval-1.10
- DOI:
- 10.18653/v1/2022.humeval-1.10
- Cite (ACL):
- Alex Lưu. 2022. Towards Human Evaluation of Mutual Understanding in Human-Computer Spontaneous Conversation: An Empirical Study of Word Sense Disambiguation for Naturalistic Social Dialogs in American English. In Proceedings of the 2nd Workshop on Human Evaluation of NLP Systems (HumEval), pages 116–125, Dublin, Ireland. Association for Computational Linguistics.
- Cite (Informal):
- Towards Human Evaluation of Mutual Understanding in Human-Computer Spontaneous Conversation: An Empirical Study of Word Sense Disambiguation for Naturalistic Social Dialogs in American English (Lưu, HumEval 2022)
- PDF:
- https://preview.aclanthology.org/ingestion-script-update/2022.humeval-1.10.pdf