# Self-training Language Models in Arithmetic Reasoning

This repository contains the code for the submission "Self-training Language Models in Arithmetic Reasoning".

Our experiments investigate whether the accuracy of models trained on supervised data can be further improved by **self-training**, while using **only a subset of the original training data**. As a starting point, we fine-tune the arithmetical-reasoning models pretrained by the authors of [this related work](https://aclanthology.org/2023.emnlp-main.742/).


## Usage 

- `examples/predict_calc.py` - script for creating predictions (for either offline self-training or testing)
- `examples/test_calc.py` - script for evaluation
- `examples/selftrain_calc_offline_po.py` - script for offline self-training with preference optimization (PO)
- `examples/selftrain_calc_offline_sft.py` - script for offline self-training with supervised fine-tuning (SFT)
- `examples/selftrain_calc_online.py` - script for online self-training with SFT and PO
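
To illustrate the difference between the two offline variants, the following is a minimal sketch of how a model's own predictions might be turned into SFT examples and PO preference pairs. It is not the code from this repository: the `Prediction` class, both helper functions, the dictionary keys, and the rule of selecting predictions by exact match of the final answer are all assumptions made for illustration.

```python
from dataclasses import dataclass
from itertools import product


@dataclass
class Prediction:
    """One sampled model output for a question (hypothetical structure)."""
    question: str
    reasoning: str  # generated reasoning chain
    answer: str     # final answer extracted from the chain


def build_sft_examples(predictions, gold_answers):
    """SFT data (assumed rule): keep predictions whose answer matches the reference."""
    return [
        {"input": p.question, "target": p.reasoning}
        for p in predictions
        if p.answer == gold_answers[p.question]
    ]


def build_preference_pairs(predictions, gold_answers):
    """PO data (assumed rule): pair every correct prediction with every
    incorrect prediction for the same question."""
    by_question = {}
    for p in predictions:
        is_correct = p.answer == gold_answers[p.question]
        groups = by_question.setdefault(p.question, {True: [], False: []})
        groups[is_correct].append(p)

    pairs = []
    for question, groups in by_question.items():
        for chosen, rejected in product(groups[True], groups[False]):
            pairs.append({
                "prompt": question,
                "chosen": chosen.reasoning,
                "rejected": rejected.reasoning,
            })
    return pairs


# Toy data: two sampled predictions for the same question, one of them wrong.
predictions = [
    Prediction("2 + 3 * 4", "3 * 4 = 12; 2 + 12 = 14", "14"),
    Prediction("2 + 3 * 4", "2 + 3 = 5; 5 * 4 = 20", "20"),
]
gold_answers = {"2 + 3 * 4": "14"}

sft_examples = build_sft_examples(predictions, gold_answers)
preference_pairs = build_preference_pairs(predictions, gold_answers)
```

In this sketch, SFT uses only the correct predictions, whereas PO also makes use of the incorrect ones as rejected alternatives. The actual selection criteria and data formats are defined by the scripts listed above.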


## Authorship

We declare that parts of this code were taken or adapted from the repository of [this related work](https://aclanthology.org/2023.emnlp-main.742/).

The original repository served for supervised training of language models for arithmetical reasoning. We provide our own implementations for self-training, but reuse or adapt the original utilities related to inference, parsing and evaluation.

Specifically, our contributions are:
- `examples/predict_calc.py` - we adapt the original prediction script to make it suitable for generating an offline self-training dataset
- `examples/selftrain_calc_offline_sft.py` - we adapt the original training script for offline supervised self-training on the model's predictions
- `examples/selftrain_calc_offline_po.py` - we create an offline training script for self-training with preference optimization
- `examples/selftrain_calc_online.py` - we create a training script for online self-training, where the model is trained on the fly on its own predictions
- `gadgets/selftrain.py` - we create self-training utilities (e.g., data processing and progress tracking)

