@inproceedings{li-etal-2024-exploring-mathematical,
    title = "Exploring Mathematical Extrapolation of Large Language Models with Synthetic Data",
    author = "Li, Haolong  and
      Ma, Yu  and
      Zhang, Yinqi  and
      Ye, Chen  and
      Chen, Jie",
    editor = "Ku, Lun-Wei  and
      Martins, Andre  and
      Srikumar, Vivek",
    booktitle = "Findings of the Association for Computational Linguistics: ACL 2024",
    month = aug,
    year = "2024",
    address = "Bangkok, Thailand",
    publisher = "Association for Computational Linguistics",
    url = "https://preview.aclanthology.org/ingest-emnlp/2024.findings-acl.55/",
    doi = "10.18653/v1/2024.findings-acl.55",
    pages = "936--946",
    abstract = "While large language models (LLMs) have shown excellent capabilities in language understanding, text generation and many other tasks, they still struggle in complex multi-step reasoning problems such as mathematical reasoning. In this paper, through a newly proposed arithmetical puzzle problem, we show that the model can perform well on multi-step reasoning tasks via fine tuning on high-quality synthetic data. Experiments with the open-llama-3B model on three different test datasets show that not only the model can reach a zero-shot pass@1 at 0.44 on the in-domain dataset, it also demonstrates certain generalization capabilities on the out-of-domain datasets. Specifically, this paper has designed two out-of-domain datasets in the form of extending the numerical range and the composing components of the arithmetical puzzle problem separately. The fine-tuned model have shown encouraging performance on these two far more difficult tasks with the zero-shot pass@1 at 0.33 and 0.35 correspondingly."
}