# Code for Text2Chart31: Instruction Tuning for Chart Generation with Automatic Feedback

Supervised fine-tuning was conducted on 2 RTX6000A GPUs, and RL fine-tuning was conducted on 6 RTX6000A GPUs.
We provide the dataset files and the code for training the Llama 3 Instruct model on each task.


## Dataset File
- Code, description, data table, and reasoning steps:
    - Training set: `./prepare-data/Text2Chart-31-train.json`
    - Test set: `./prepare-data/Text2Chart-31-test.json`
- Dataset files including the figures:
    - Training set: https://drive.google.com/file/d/11otHdVt7eJqAJ7RJl71G6eFBsKHNYEAM/view?usp=sharing 
    - Test set: https://drive.google.com/file/d/1ckNEhhWA-eGPiGl-j7Mc_5UldsNNtOZX/view?usp=sharing
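
To see which fields the JSON files contain before writing any processing code, a small inspection helper can be useful. The sketch below makes no assumption about the schema: it only reports the top-level type, the number of entries, and the keys of the first record. The function name `summarize_json` is hypothetical and is not part of this repository.

```python
import json
from pathlib import Path


def summarize_json(path):
    """Load a JSON file and report its top-level shape without assuming a schema."""
    with Path(path).open(encoding="utf-8") as f:
        data = json.load(f)

    size = len(data) if isinstance(data, (list, dict)) else None
    summary = {"type": type(data).__name__, "size": size}

    # Peek at one record so the available fields can be read off directly.
    if isinstance(data, list) and data:
        first = data[0]
    elif isinstance(data, dict) and data:
        first = next(iter(data.values()))
    else:
        first = None
    summary["first_record_keys"] = sorted(first) if isinstance(first, dict) else None
    return summary


if __name__ == "__main__":
    # Path taken from the list above; the check keeps the script safe to run anywhere.
    train_path = Path("./prepare-data/Text2Chart-31-train.json")
    if train_path.exists():
        print(summarize_json(train_path))
```

Run it from the repository root so that the relative path resolves.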


## LoRA checkpoints
Download a checkpoint, unzip it under the `checkpoint` folder, and then run the inference code.
### Supervised fine-tuned model
1. Task 1: https://drive.google.com/file/d/1DfG4kHO1N4QeG5SMVlVqMpqFQBNr3qhr/view?usp=sharing
2. Task 3: https://drive.google.com/file/d/14Yyju22AXbQ_lakOkzt_eMkv7YN2iKyc/view?usp=sharing
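
The unzip step can also be scripted. The following is a minimal sketch using Python's standard `zipfile` module; the archive name `sft-task1.zip` and the helper `extract_checkpoint` are assumptions for illustration, since the actual file names depend on what Google Drive delivers.

```python
import zipfile
from pathlib import Path


def extract_checkpoint(zip_path, dest="checkpoint"):
    """Extract a downloaded checkpoint archive into the destination folder."""
    dest_dir = Path(dest)
    dest_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
        # Return the top-level entries so the extracted layout can be checked.
        return sorted({name.split("/")[0] for name in archive.namelist()})


if __name__ == "__main__":
    # "sft-task1.zip" is a hypothetical name for the downloaded file.
    archive_path = Path("sft-task1.zip")
    if archive_path.exists():
        print(extract_checkpoint(archive_path))
```

The returned list shows which folders ended up directly under `checkpoint`, which helps confirm that the layout matches what the inference scripts expect.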

## Reward model checkpoint
1. OPT model: https://drive.google.com/file/d/1W7HsPs4F2Js1l8zO-iCNRiXLcML4AswP/view?usp=sharing

## Training code

### Supervised fine-tuning
1. Task 1: Run `python sft-task1.py`
2. Task 2: Run `python sft-task2.py`
3. Task 3: Run `python sft-task3.py`

### RL fine-tuning
1. Task 1 & Task 3: Run `python rl-task1-task3.py` (download the reward model and SFT model checkpoints beforehand).
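
Because RL fine-tuning depends on checkpoints that must already be in place, a quick pre-flight check can save a failed launch. The sketch below is only an illustration: the helper `missing_checkpoints` and the folder names passed to it are hypothetical, and it assumes all checkpoints sit under the `checkpoint` folder, so adjust them to the paths that `rl-task1-task3.py` actually reads.

```python
from pathlib import Path


def missing_checkpoints(required, root="checkpoint"):
    """Return the entries from `required` that are absent or empty under `root`."""
    missing = []
    for name in required:
        path = Path(root) / name
        if not path.exists():
            missing.append(name)
        elif path.is_dir() and not any(path.iterdir()):
            # An empty folder usually means the archive was never extracted.
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Hypothetical folder names; replace them with the real checkpoint folders.
    required = ["sft-task1", "sft-task3", "reward-model"]
    absent = missing_checkpoints(required)
    if absent:
        print("Missing checkpoints:", ", ".join(absent))
    else:
        print("All checkpoints found.")
```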

## Inference code

### Task 1
1. Base model: Run `python generate-llama3-base.py`
2. SFT model: Run `python generate-llama3-bf16-sft.py`
3. RL model: Run `python generate-llama3-bf16-rl.py`

### Task 2
1. SFT model: Run `python generate2-llama3-sft.py` (you need to train the Task 2 SFT model beforehand).

### Task 3
1. Base model: Run `python generate3-llama3-base.py`
2. SFT model: Run `python generate3-llama3-bf16-sft.py`
3. RL model: Run `python generate3-llama3-bf16-rl.py`

