Abstract
While large language models (LLMs) have shown excellent capabilities in language understanding, text generation, and many other tasks, they still struggle with complex multi-step reasoning problems such as mathematical reasoning. In this paper, through a newly proposed arithmetical puzzle problem, we show that a model can perform well on multi-step reasoning tasks via fine-tuning on high-quality synthetic data. Experiments with the open-llama-3B model on three different test datasets show that not only can the model reach a zero-shot pass@1 of 0.44 on the in-domain dataset, it also demonstrates some generalization capability on out-of-domain datasets. Specifically, this paper designs two out-of-domain datasets by extending the numerical range and the composing components of the arithmetical puzzle problem, respectively. The fine-tuned model shows encouraging performance on these two far more difficult tasks, with zero-shot pass@1 of 0.33 and 0.35, respectively.
- Anthology ID:
- 2024.findings-acl.55
- Volume:
- Findings of the Association for Computational Linguistics: ACL 2024
- Month:
- August
- Year:
- 2024
- Address:
- Bangkok, Thailand
- Editors:
- Lun-Wei Ku, Andre Martins, Vivek Srikumar
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 936–946
- URL:
- https://preview.aclanthology.org/add_missing_videos/2024.findings-acl.55/
- DOI:
- 10.18653/v1/2024.findings-acl.55
- Cite (ACL):
- Haolong Li, Yu Ma, Yinqi Zhang, Chen Ye, and Jie Chen. 2024. Exploring Mathematical Extrapolation of Large Language Models with Synthetic Data. In Findings of the Association for Computational Linguistics: ACL 2024, pages 936–946, Bangkok, Thailand. Association for Computational Linguistics.
- Cite (Informal):
- Exploring Mathematical Extrapolation of Large Language Models with Synthetic Data (Li et al., Findings 2024)
- PDF:
- https://preview.aclanthology.org/add_missing_videos/2024.findings-acl.55.pdf
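For readers unfamiliar with the pass@1 metric reported in the abstract, below is a minimal sketch of the standard unbiased pass@k estimator (Chen et al., 2021), of which pass@1 is the special case; this is a generic illustration, not the authors' evaluation code, and the variable names are illustrative:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator (Chen et al., 2021): the probability
    that at least one of k samples, drawn without replacement from
    n generations of which c are correct, is correct."""
    if n - c < k:
        return 1.0  # every size-k draw must contain a correct sample
    return 1.0 - comb(n - c, k) / comb(n, k)

# Zero-shot pass@1 with a single greedy sample per problem reduces
# to plain accuracy over the test set (illustrative flags below).
results = [True, False, True]
print(sum(results) / len(results))  # 0.666...
```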