ROSHA at SemEval-2024 Task 9: BRAINTEASER A Novel Task Defying Common Sense
Mohammadmostafa Rostamkhani, Shayan Mousavinia, Sauleh Eetemadi
Abstract
For SemEval-2024 Task 9, BRAINTEASER: A Novel Task Defying Common Sense, we tried several strategies for the BRAINTEASER QA task, which comprises both sentence puzzles and word puzzles. In our first approach, we trained XLM-RoBERTa in three settings: on the original training dataset alone, on the original dataset combined with BiRdQA, and on the original dataset combined with RiddleSense. A second strategy expanded each word in the BiRdQA dataset into a full sentence, with the aim of strengthening the semantic contribution of individual words when training on word puzzle (WP) riddles. Using ChatGPT-3.5, we expanded each word into a longer sentence and applied this process to every option of each riddle. We also explored RECONCILE (round-table conference) with three large language models (LLMs): ChatGPT, Gemini, and Mixtral-8x7B. As a final approach, we used GPT-4 results. Our most successful experiment achieved a score of 0.900 on sentence puzzles (S_ori) and 0.906 on word puzzles (W_ori).
- Anthology ID:
- 2024.semeval-1.150
- Volume:
- Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)
- Month:
- June
- Year:
- 2024
- Address:
- Mexico City, Mexico
- Editors:
- Atul Kr. Ojha, A. Seza Doğruöz, Harish Tayyar Madabushi, Giovanni Da San Martino, Sara Rosenthal, Aiala Rosá
- Venue:
- SemEval
- SIG:
- SIGLEX
- Publisher:
- Association for Computational Linguistics
- Pages:
- 1038–1042
- URL:
- https://aclanthology.org/2024.semeval-1.150
- DOI:
- 10.18653/v1/2024.semeval-1.150
- Cite (ACL):
- Mohammadmostafa Rostamkhani, Shayan Mousavinia, and Sauleh Eetemadi. 2024. ROSHA at SemEval-2024 Task 9: BRAINTEASER A Novel Task Defying Common Sense. In Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024), pages 1038–1042, Mexico City, Mexico. Association for Computational Linguistics.
- Cite (Informal):
- ROSHA at SemEval-2024 Task 9: BRAINTEASER A Novel Task Defying Common Sense (Rostamkhani et al., SemEval 2024)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-4/2024.semeval-1.150.pdf