NumDecoders at SemEval-2024 Task 7: FlanT5 and GPT enhanced with CoT for Numerical Reasoning

Andres Gonzalez, Md Zobaer Hossain, Jahedul Alam Junaed


Abstract
In this paper we present a Chain-of-Thought enhanced solution for large language models, including FlanT5 and GPT-3.5 Turbo, aimed at solving mathematical problems to fill in blanks from news headlines. Our approach builds on a data augmentation strategy that incorporates additional mathematical reasoning observations into the original dataset sourced from another mathematical corpus. Both automatic and manual annotations are applied to explicitly describe the reasoning steps required for models to reach the target answer. We employ an ensemble majority voting method to generate final predictions across our best-performing models. Our analysis reveals that while larger models trained with our enhanced dataset achieve significant gains (91% accuracy, ranking 5th on the NumEval Task 3 leaderboard), smaller models do not experience improvements and may even see a decrease in overall accuracy. We conclude that improving our automatic annotations via crowdsourcing methods can be a worthwhile endeavor to train larger models than the ones from this study to see the most accurate results.
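The ensemble majority voting step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not describe how ties are broken or how answers are normalized, so the function names (`majority_vote`, `ensemble`), the string representation of answers, and the tie-breaking rule (earliest model wins) are all assumptions made for illustration, not the authors' implementation.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most frequent answer among the models' predictions.

    Assumption: ties are broken in favour of the model listed first.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    counts = Counter(predictions)
    top = max(counts.values())
    # Walk the predictions in model order so ties resolve deterministically.
    for answer in predictions:
        if counts[answer] == top:
            return answer


def ensemble(per_model_predictions):
    """Combine per-model answer lists into one answer per headline.

    per_model_predictions: one list per model, aligned by headline index.
    """
    return [majority_vote(list(votes)) for votes in zip(*per_model_predictions)]


# Hypothetical answers from three models for two headlines.
model_outputs = [
    ["3", "10"],
    ["3", "12"],
    ["4", "12"],
]
final_answers = ensemble(model_outputs)
```

Placing the strongest model first in the list means that, under this assumed tie-breaking rule, its answer is used whenever all models disagree.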
Anthology ID:
2024.semeval-1.183
Volume:
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)
Month:
June
Year:
2024
Address:
Mexico City, Mexico
Editors:
Atul Kr. Ojha, A. Seza Doğruöz, Harish Tayyar Madabushi, Giovanni Da San Martino, Sara Rosenthal, Aiala Rosá
Venue:
SemEval
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Pages:
1260–1268
URL:
https://aclanthology.org/2024.semeval-1.183
Cite (ACL):
Andres Gonzalez, Md Zobaer Hossain, and Jahedul Alam Junaed. 2024. NumDecoders at SemEval-2024 Task 7: FlanT5 and GPT enhanced with CoT for Numerical Reasoning. In Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024), pages 1260–1268, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal):
NumDecoders at SemEval-2024 Task 7: FlanT5 and GPT enhanced with CoT for Numerical Reasoning (Gonzalez et al., SemEval 2024)
PDF:
https://preview.aclanthology.org/jeptaln-2024-ingestion/2024.semeval-1.183.pdf
Supplementary material:
 2024.semeval-1.183.SupplementaryMaterial.zip
Supplementary material:
 2024.semeval-1.183.SupplementaryMaterial.txt