ROSHA at SemEval-2024 Task 9: BRAINTEASER A Novel Task Defying Common Sense

Mohammadmostafa Rostamkhani; Shayan Mousavinia; Sauleh Eetemadi

Abstract

For SemEval-2024 Task 9, BRAINTEASER: A Novel Task Defying Common Sense, we tried several strategies on the BRAINTEASER QA task, which comprises both sentence puzzles and word puzzles. In the first approach, we trained XLM-RoBERTa under three configurations: the original training dataset alone, the original dataset combined with BiRdQA, and the original dataset combined with RiddleSense. A second strategy expanded each word in the BiRdQA dataset into a full sentence, with the aim of strengthening the semantic contribution of individual words when training on word puzzle (WP) riddles. We used ChatGPT-3.5 to perform this expansion for every option in each riddle. We also explored RECONCILE (round-table conference) with three large language models (LLMs): ChatGPT, Gemini, and Mixtral-8x7B. As a final approach, we used results from GPT-4. Our most successful experiment achieved a score of 0.900 on sentence puzzles (S_ori) and 0.906 on word puzzles (W_ori).

Anthology ID: 2024.semeval-1.150
Volume: Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)
Month: June
Year: 2024
Address: Mexico City, Mexico
Editors: Atul Kr. Ojha, A. Seza Doğruöz, Harish Tayyar Madabushi, Giovanni Da San Martino, Sara Rosenthal, Aiala Rosá
Venue: SemEval
SIG: SIGLEX
Publisher: Association for Computational Linguistics
Pages: 1038–1042
URL: https://aclanthology.org/2024.semeval-1.150
Cite (ACL): Mohammadmostafa Rostamkhani, Shayan Mousavinia, and Sauleh Eetemadi. 2024. ROSHA at SemEval-2024 Task 9: BRAINTEASER A Novel Task Defying Common Sense. In Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024), pages 1038–1042, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal): ROSHA at SemEval-2024 Task 9: BRAINTEASER A Novel Task Defying Common Sense (Rostamkhani et al., SemEval 2024)
PDF: https://preview.aclanthology.org/jeptaln-2024-ingestion/2024.semeval-1.150.pdf
Supplementary material: 2024.semeval-1.150.SupplementaryMaterial.txt
Supplementary material: 2024.semeval-1.150.SupplementaryMaterial.zip
