Zero Shot is All You Need at SemEval-2024 Task 9: A study of State of the Art LLMs on Lateral Thinking Puzzles

Erfan Moosavi Monazzah; Mahdi Feghhi

doi:10.18653/v1/2024.semeval-1.264

Abstract

The successful deployment of large language models in numerous NLP tasks has spurred demand for tackling more complex tasks that were previously out of reach. SemEval-2024 Task 9 introduces the brainteaser dataset, which requires intricate, human-like reasoning to solve puzzles that challenge common sense. At first glance, the riddles in the dataset may appear trivial for humans to solve. However, they demand lateral thinking, which departs from the vertical thinking that dominates current reasoning tasks. In this paper, we examine the ability of current state-of-the-art LLMs to solve this task. We diversify our study by selecting both open- and closed-source LLMs with varying numbers of parameters. Additionally, we extend the task dataset with synthetic explanations derived from the LLMs’ reasoning processes during task resolution. These could serve as a valuable resource for further expanding the task dataset and developing more robust methods for tasks that require complex reasoning. All code and datasets are available in the paper’s GitHub repository.

Anthology ID: 2024.semeval-1.264
Volume: Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)
Month: June
Year: 2024
Address: Mexico City, Mexico
Editors: Atul Kr. Ojha, A. Seza Doğruöz, Harish Tayyar Madabushi, Giovanni Da San Martino, Sara Rosenthal, Aiala Rosá
Venue: SemEval
SIG: SIGLEX
Publisher: Association for Computational Linguistics
Pages: 1889–1893
URL: https://preview.aclanthology.org/build-pipeline-with-new-library/2024.semeval-1.264/
DOI: 10.18653/v1/2024.semeval-1.264
Cite (ACL): Erfan Moosavi Monazzah and Mahdi Feghhi. 2024. Zero Shot is All You Need at SemEval-2024 Task 9: A study of State of the Art LLMs on Lateral Thinking Puzzles. In Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024), pages 1889–1893, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal): Zero Shot is All You Need at SemEval-2024 Task 9: A study of State of the Art LLMs on Lateral Thinking Puzzles (Moosavi Monazzah & Feghhi, SemEval 2024)
PDF: https://preview.aclanthology.org/build-pipeline-with-new-library/2024.semeval-1.264.pdf
Supplementary material: 2024.semeval-1.264.SupplementaryMaterial.txt
Supplementary material: 2024.semeval-1.264.SupplementaryMaterial.zip
