# MAPLE
## In-domain seq2seq training
Use `./maple/seq2seq.py` to train the seq2seq model with LoRA. 
Run `run_seq2seq_ablation.sh` to train the seq2seq model when testing with RL or supervised fine-tuning.

## SemSim transformation
Use `aggregate_semsim.py` to run the SemSim metric on the seq2seq-generated mutations. Use `--dataset_train` and `--dataset_test` to specify the datasets to use;
`--output_dir` to set where the results are saved; `--seq2seq_model_path` to specify the path to the generations; and `--smodel` to specify the sentence transformer model to use.
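
As a rough illustration of what a sentence-embedding similarity score computes, the sketch below compares two embedding vectors by cosine similarity. It assumes a cosine-based score, which this README does not state. The vectors are made-up toy values standing in for the output of a sentence transformer, and `cosine_similarity` is a hypothetical helper, not a function from `aggregate_semsim.py`.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch: zero vectors score 0
    return dot / (norm_a * norm_b)


# Toy embeddings (assumed values, for illustration only).
reference = [0.2, 0.7, 0.1]
generation = [0.25, 0.65, 0.05]
unrelated = [-0.7, 0.1, 0.6]

# A close paraphrase should score higher than an unrelated sentence.
print(round(cosine_similarity(reference, generation), 3))
print(round(cosine_similarity(reference, unrelated), 3))
```

In practice the vectors would come from the chosen sentence transformer model rather than being written by hand.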
### Other metrics
Use `aggregate_other_metrics.py` to run alternative metrics. Available metrics include `bleu`, `rouge`, `meteor`, `bleurt`, `sacrebleu`, and `bartscore`. Use `--experiment_id` when running multiple processes.

## LR training
Use `LR_train.py` to train the LR model. Use `--dataset` to specify the dataset to use and `--output_dir` to set where the results are saved.
By default, the script runs the experiments 100 times. Use `--testing` to run them only once.
Other parameters can be found in the script.
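
Repeating an experiment many times is commonly done to report a mean and spread across random seeds. The sketch below shows that pattern under stated assumptions: one seed per run, and a `testing` argument that only mimics the described effect of `--testing`. `run_once` and `repeat_experiment` are hypothetical names, and `run_once` returns a fake score rather than calling anything in `LR_train.py`.

```python
import random
import statistics
from typing import Callable, List, Tuple


def run_once(seed: int) -> float:
    """Hypothetical stand-in for one train/evaluate run; returns a fake score."""
    rng = random.Random(seed)
    return 0.80 + rng.uniform(-0.05, 0.05)


def repeat_experiment(
    run: Callable[[int], float],
    n_runs: int = 100,
    testing: bool = False,
) -> Tuple[float, float, List[float]]:
    """Run `run` once per seed and summarize the scores."""
    if testing:
        n_runs = 1  # single quick run, as a smoke test
    scores = [run(seed) for seed in range(n_runs)]
    mean = statistics.fmean(scores)
    # The sample standard deviation is undefined for a single run.
    std = statistics.stdev(scores) if len(scores) > 1 else 0.0
    return mean, std, scores


mean, std, scores = repeat_experiment(run_once)
print(f"{len(scores)} runs: mean={mean:.3f}, std={std:.3f}")
```

Seeding each run separately keeps individual runs reproducible while still exposing the variance across runs.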

## To reproduce MAPLE experiments in one go
Use `run_MAPLE.sh`.


# Baselines
Use `run_SEED.sh`, which calls `SEED.py`, to reproduce the experiments on the SEED baselines. The results are saved in the `seed_results` directory.

Use `run_PET.sh`, which uses the files in the `pet` directory, to reproduce the experiments on the PET baselines. The results are saved in the `pet_results` directory.

Use `run_llama2.sh`, which calls `llama2.py`, to reproduce the experiments on the LLaMA 2 baselines. The results are saved in the `llama2_results` directory.

