Abstract
The successful deployment of large language models in numerous NLP tasks has spurred demand for tackling more complex tasks that were previously out of reach. SemEval-2024 Task 9 introduces the brainteaser dataset, which requires intricate, human-like reasoning to solve puzzles that challenge common sense. At first glance, the riddles in the dataset may appear trivial for humans to solve. However, they demand lateral thinking, which deviates from the vertical thinking that dominates current reasoning tasks. In this paper, we examine the ability of current state-of-the-art LLMs to solve this task. Our study covers both open- and closed-source LLMs with varying numbers of parameters. Additionally, we extend the task dataset with synthetic explanations derived from the LLMs’ reasoning processes during task resolution. These could serve as a valuable resource for further expanding the task dataset and developing more robust methods for tasks that require complex reasoning. All code and datasets are available in the paper’s GitHub repository.
- Anthology ID:
- 2024.semeval-1.264
- Volume:
- Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)
- Month:
- June
- Year:
- 2024
- Address:
- Mexico City, Mexico
- Editors:
- Atul Kr. Ojha, A. Seza Doğruöz, Harish Tayyar Madabushi, Giovanni Da San Martino, Sara Rosenthal, Aiala Rosá
- Venue:
- SemEval
- SIG:
- SIGLEX
- Publisher:
- Association for Computational Linguistics
- Pages:
- 1889–1893
- URL:
- https://preview.aclanthology.org/build-pipeline-with-new-library/2024.semeval-1.264/
- DOI:
- 10.18653/v1/2024.semeval-1.264
- Cite (ACL):
- Erfan Moosavi Monazzah and Mahdi Feghhi. 2024. Zero Shot is All You Need at SemEval-2024 Task 9: A study of State of the Art LLMs on Lateral Thinking Puzzles. In Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024), pages 1889–1893, Mexico City, Mexico. Association for Computational Linguistics.
- Cite (Informal):
- Zero Shot is All You Need at SemEval-2024 Task 9: A study of State of the Art LLMs on Lateral Thinking Puzzles (Moosavi Monazzah & Feghhi, SemEval 2024)
- PDF:
- https://preview.aclanthology.org/build-pipeline-with-new-library/2024.semeval-1.264.pdf