Abstract
Natural language generation systems have made substantial progress in recent years, but they have been shown to generate tokens that are unrelated to the source input. This problem affects computational models in many NLP tasks, and it is particularly problematic in multimodal systems. In this work, we assess the rate of object hallucination in multimodal conversational agents playing the GuessWhat?! referential game. Better visual processing has been shown to mitigate this issue in image captioning; hence, we adapt the best available visual processing models to the GuessWhat?! task and propose two new models to play the Questioner agent. We show that the new models generate fewer hallucinations than other well-known models in the literature. Moreover, their hallucinations are less severe (they affect task accuracy less) and are more human-like. We also analyse where hallucinations tend to occur in the dialogue: hallucinations are less frequent in earlier turns, cause a cascade hallucination effect, and are often preceded by negative answers, which have been shown to be harder to ground.
- Anthology ID:
- 2021.acl-srw.11
- Volume:
- Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: Student Research Workshop
- Month:
- August
- Year:
- 2021
- Address:
- Online
- Editors:
- Jad Kabbara, Haitao Lin, Amandalynne Paullada, Jannis Vamvas
- Venues:
- ACL | IJCNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 101–111
- URL:
- https://aclanthology.org/2021.acl-srw.11
- DOI:
- 10.18653/v1/2021.acl-srw.11
- Cite (ACL):
- Alberto Testoni and Raffaella Bernardi. 2021. “I’ve Seen Things You People Wouldn’t Believe”: Hallucinating Entities in GuessWhat?!. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: Student Research Workshop, pages 101–111, Online. Association for Computational Linguistics.
- Cite (Informal):
- “I’ve Seen Things You People Wouldn’t Believe”: Hallucinating Entities in GuessWhat?! (Testoni & Bernardi, ACL-IJCNLP 2021)
- PDF:
- https://preview.aclanthology.org/naacl24-info/2021.acl-srw.11.pdf
- Data
- GuessWhat?!, MS COCO