Zhen Qian


2024

pdf
ZXQ at SemEval-2024 Task 7: Fine-tuning GPT-3.5-Turbo for Numerical Reasoning
Zhen Qian | Xiaofei Xu | Xiuzhen Zhang
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)

In this paper, we present our system for the SemEval-2024 Task 7, i.e., NumEval subtask 3: Numericial Reasoning. Given a news article and its headline, the numerical reasoning task involves creating a system to compute the intentionally excluded number within the news headline. We propose a fine-tuned GPT-3.5-turbo model, specifically engineered to deduce missing numerals directly from the content of news article. The model is trained with a human-engineered prompt that itegrates the news content and the masked headline, tailoring its accuracy for the designated task. It achieves an accuracy of 0.94 on the test data and secures the second position in the official leaderboard. An examination on the system’s inference results reveals its commendable accuracy in identifying correct numerals when they can be directly “copied” from the articles. However, the error rates increase when it comes to some ambiguous operations such as rounding.