Is It Smaller Than a Tennis Ball? Language Models Play the Game of Twenty Questions
Maxime De Bruyn, Ehsan Lotfi, Jeska Buhmann, Walter Daelemans
Abstract
Researchers often use games to analyze the abilities of Artificial Intelligence models. In this work, we use the game of Twenty Questions to study the world knowledge of language models. Despite its simplicity for humans, this game requires a broad knowledge of the world to answer yes/no questions. We evaluate several language models on this task and find that only the largest model has enough world knowledge to play it well, although it still has difficulties with the shape and size of objects. We also present a new method to improve the knowledge of smaller models by leveraging external information from the web. Finally, we release our dataset and Twentle, a website to interactively test the knowledge of language models by playing Twenty Questions.
- Anthology ID:
- 2022.blackboxnlp-1.7
- Volume:
- Proceedings of the Fifth BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP
- Month:
- December
- Year:
- 2022
- Address:
- Abu Dhabi, United Arab Emirates (Hybrid)
- Editors:
- Jasmijn Bastings, Yonatan Belinkov, Yanai Elazar, Dieuwke Hupkes, Naomi Saphra, Sarah Wiegreffe
- Venue:
- BlackboxNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 80–90
- URL:
- https://aclanthology.org/2022.blackboxnlp-1.7
- DOI:
- 10.18653/v1/2022.blackboxnlp-1.7
- Cite (ACL):
- Maxime De Bruyn, Ehsan Lotfi, Jeska Buhmann, and Walter Daelemans. 2022. Is It Smaller Than a Tennis Ball? Language Models Play the Game of Twenty Questions. In Proceedings of the Fifth BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP, pages 80–90, Abu Dhabi, United Arab Emirates (Hybrid). Association for Computational Linguistics.
- Cite (Informal):
- Is It Smaller Than a Tennis Ball? Language Models Play the Game of Twenty Questions (De Bruyn et al., BlackboxNLP 2022)
- PDF:
- https://preview.aclanthology.org/landing_page/2022.blackboxnlp-1.7.pdf