Abstract
In this contribution, we examine the proficiency of Large Language Models (LLMs) in solving the linguistic game “La Ghigliottina,” the final game of the popular Italian TV quiz show “L’Eredità”. This game is particularly challenging because it requires LLMs to perform semantic inference to identify the solution. Our experiment draws inspiration from Ghigliottin-AI, a task of EVALITA 2020, an evaluation campaign focusing on Natural Language Processing (NLP) and speech tools designed for the Italian language. To benchmark our experiment, we use the results of the most successful artificial player in this task, namely Il Mago della Ghigliottina. The paper describes the experimental setting and the results, which show that LLMs perform poorly.
- Anthology ID:
- 2024.games-1.11
- Volume:
- Proceedings of the 10th Workshop on Games and Natural Language Processing @ LREC-COLING 2024
- Month:
- May
- Year:
- 2024
- Address:
- Torino, Italia
- Editors:
- Chris Madge, Jon Chamberlain, Karen Fort, Udo Kruschwitz, Stephanie Lukin
- Venues:
- games | WS
- Publisher:
- ELRA and ICCL
- Pages:
- 97–106
- URL:
- https://aclanthology.org/2024.games-1.11
- Cite (ACL):
- Raffaele Manna, Maria Pia di Buono, and Johanna Monti. 2024. Riddle Me This: Evaluating Large Language Models in Solving Word-Based Games. In Proceedings of the 10th Workshop on Games and Natural Language Processing @ LREC-COLING 2024, pages 97–106, Torino, Italia. ELRA and ICCL.
- Cite (Informal):
- Riddle Me This: Evaluating Large Language Models in Solving Word-Based Games (Manna et al., games-WS 2024)
- PDF:
- https://preview.aclanthology.org/ingest-2024-clasp/2024.games-1.11.pdf