Abstract
Recent large language models (LLMs) can solve puzzles that require creativity and lateral thinking. To advance this front of research, we tackle SemEval-2024 Task 9: BRAINTEASER: A Novel Task Defying Common Sense. We approach this task by introducing a technique that we call Fine-tuned Generated Chain-of-Thought (FtG-CoT). It is a novel few-shot prompting method that combines a fine-tuned BERT classifier encoder with zero-shot chain-of-thought generation and a fine-tuned LLM. The fine-tuned BERT classifier provides a context-rich encoding of each example question and choice list. Zero-shot chain-of-thought generation leverages the benefits of chain-of-thought prompting without requiring manual creation of the reasoning chains. We fine-tune the LLM on the generated chains-of-thought and include a set of generated reasoning chains in the final few-shot LLM prompt to maximize the relevance and correctness of the final generated response. In this paper, we show that FtG-CoT outperforms the zero-shot prompting baseline presented in the task paper and is highly effective at solving challenging sentence puzzles achieving a perfect score on the practice set and a 0.9 score on the evaluation set.- Anthology ID:
- 2024.semeval-1.181
- Volume:
- Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)
- Month:
- June
- Year:
- 2024
- Address:
- Mexico City, Mexico
- Editors:
- Atul Kr. Ojha, A. Seza Doğruöz, Harish Tayyar Madabushi, Giovanni Da San Martino, Sara Rosenthal, Aiala Rosá
- Venue:
- SemEval
- SIG:
- SIGLEX
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 1245–1251
- Language:
- URL:
- https://preview.aclanthology.org/build-pipeline-with-new-library/2024.semeval-1.181/
- DOI:
- 10.18653/v1/2024.semeval-1.181
- Cite (ACL):
- Micah Zhang, Shafiuddin Rehan Ahmed, and James H. Martin. 2024. FtG-CoT at SemEval-2024 Task 9: Solving Sentence Puzzles Using Fine-Tuned Language Models and Zero-Shot CoT Prompting. In Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024), pages 1245–1251, Mexico City, Mexico. Association for Computational Linguistics.
- Cite (Informal):
- FtG-CoT at SemEval-2024 Task 9: Solving Sentence Puzzles Using Fine-Tuned Language Models and Zero-Shot CoT Prompting (Zhang et al., SemEval 2024)
- PDF:
- https://preview.aclanthology.org/build-pipeline-with-new-library/2024.semeval-1.181.pdf