EurekaRebus - Verbalized Rebus Solving with LLMs: A CALAMITA Challenge

Gabriele Sarti; Tommaso Caselli; Arianna Bisazza; Malvina Nissim

Abstract

Language games can be valuable resources for testing the ability of large language models (LLMs) to conduct challenging multi-step, knowledge-intensive inferences while respecting predefined constraints. Our proposed challenge prompts LLMs to reason step-by-step to solve verbalized variants of rebus games recently introduced with the EurekaRebus dataset. Verbalized rebuses replace visual cues with crossword definitions to create an encrypted first pass, making the problem entirely text-based. We introduce a simplified task variant with word length hints and adopt a comprehensive set of metrics to obtain a granular overview of models’ performance in knowledge recall, constraints adherence, and re-segmentation abilities across reasoning steps.

Anthology ID: 2024.clicit-1.132
Volume: Proceedings of the 10th Italian Conference on Computational Linguistics (CLiC-it 2024)
Month: December
Year: 2024
Address: Pisa, Italy
Editors: Felice Dell'Orletta, Alessandro Lenci, Simonetta Montemagni, Rachele Sprugnoli
Venue: CLiC-it
Publisher: CEUR Workshop Proceedings
Pages: 1202–1208
URL: https://preview.aclanthology.org/Add-Cong-Liu-Florida-Atlantic-University-author-id/2024.clicit-1.132/
Cite (ACL): Gabriele Sarti, Tommaso Caselli, Arianna Bisazza, and Malvina Nissim. 2024. EurekaRebus - Verbalized Rebus Solving with LLMs: A CALAMITA Challenge. In Proceedings of the 10th Italian Conference on Computational Linguistics (CLiC-it 2024), pages 1202–1208, Pisa, Italy. CEUR Workshop Proceedings.
Cite (Informal): EurekaRebus - Verbalized Rebus Solving with LLMs: A CALAMITA Challenge (Sarti et al., CLiC-it 2024)
PDF: https://preview.aclanthology.org/Add-Cong-Liu-Florida-Atlantic-University-author-id/2024.clicit-1.132.pdf