The R-U-A-Robot Dataset: Helping Avoid Chatbot Deception by Detecting User Questions About Human or Non-Human Identity

David Gros, Yu Li, Zhou Yu


Abstract
Humans are increasingly interacting with machines through language, sometimes in contexts where the user may not know they are talking to a machine (like over the phone or a text chatbot). We aim to understand how system designers and researchers might allow their systems to confirm its non-human identity. We collect over 2,500 phrasings related to the intent of “Are you a robot?”. This is paired with over 2,500 adversarially selected utterances where only confirming the system is non-human would be insufficient or disfluent. We compare classifiers to recognize the intent and discuss the precision/recall and model complexity tradeoffs. Such classifiers could be integrated into dialog systems to avoid undesired deception. We then explore how both a generative research model (Blender) as well as two deployed systems (Amazon Alexa, Google Assistant) handle this intent, finding that systems often fail to confirm their non-human identity. Finally, we try to understand what a good response to the intent would be, and conduct a user study to compare the important aspects when responding to this intent.
Anthology ID:
2021.acl-long.544
Volume:
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
Month:
August
Year:
2021
Address:
Online
Venues:
ACL | IJCNLP
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
6999–7013
Language:
URL:
https://aclanthology.org/2021.acl-long.544
DOI:
10.18653/v1/2021.acl-long.544
Bibkey:
Cite (ACL):
David Gros, Yu Li, and Zhou Yu. 2021. The R-U-A-Robot Dataset: Helping Avoid Chatbot Deception by Detecting User Questions About Human or Non-Human Identity. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 6999–7013, Online. Association for Computational Linguistics.
Cite (Informal):
The R-U-A-Robot Dataset: Helping Avoid Chatbot Deception by Detecting User Questions About Human or Non-Human Identity (Gros et al., ACL-IJCNLP 2021)
Copy Citation:
PDF:
https://preview.aclanthology.org/paclic-22-ingestion/2021.acl-long.544.pdf
Video:
 https://preview.aclanthology.org/paclic-22-ingestion/2021.acl-long.544.mp4
Data
ETHICS