# Implementation of PromptKD

This repository is based on the MiniLLM paper (ICLR 2024). ([paper](https://openreview.net/pdf?id=5h0qf7IBZZ))


### Prerequisites

Create a conda environment and run `install.sh`:

```
conda create -n promptkd python=3.8
conda activate promptkd
bash install.sh
```

### Quick start 

We provide the dataset and model checkpoints along with the code so that the results in our paper can be reproduced.
Because the files are too large to submit through OpenReview, we distribute them via an **anonymous** Google Drive account.

To proceed, download the files and place them in the directories listed below:
1. Download the checkpoint files used in our paper from Google Drive: (https://drive.google.com/file/d/1nNUHuDpld_skFZDSLlFKFLa9MFy3Fbci/view?usp=sharing) (the account is anonymous).
2. Unzip `checkpoint.zip` and place the result under `results/gpt2/train`, e.g. `~/PromptKD/results/gpt2/train/checkpoint`.
3. Download the start_checkpoint files (for training) from Google Drive: (https://drive.google.com/file/d/1KeaD1ypdrSvmg5Lup3NAt0TxYsr0FHi1/view?usp=sharing) (the account is anonymous).
4. Unzip `start_checkpoint.zip` and place the result in the `PromptKD` folder, e.g. `~/PromptKD/start_checkpoint`.
5. Download the processed data (for training): (https://conversationhub.blob.core.windows.net/beit-share-public/MiniLLM/processed_data.tar?sv=2021-10-04&st=2023-06-08T11%3A16%3A02Z&se=2033-06-09T11%3A16%3A00Z&sr=c&sp=r&sig=N4pfCVmSeq4L4tS8QbrFVsX6f6q844eft8xSuXdxU48%3D) (this link comes from the MiniLLM GitHub README.md).
6. Extract `processed_data.tar` and place the result in the `PromptKD` folder, e.g. `~/PromptKD/processed_data`.
7. Download the data (for evaluation): (https://conversationhub.blob.core.windows.net/beit-share-public/MiniLLM/data.tar?sv=2021-10-04&st=2023-06-08T11%3A16%3A02Z&se=2033-06-09T11%3A16%3A00Z&sr=c&sp=r&sig=N4pfCVmSeq4L4tS8QbrFVsX6f6q844eft8xSuXdxU48%3D) (this link comes from the MiniLLM GitHub README.md).
8. Extract `data.tar` and place the result in the `PromptKD` folder, e.g. `~/PromptKD/processed_data`.
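
As a quick sanity check after the steps above, the following sketch reports which of the expected folders are present. The function name `check_layout` and the `PROMPTKD_ROOT` variable are hypothetical helpers, not part of the repository; the folder names are the ones listed in the steps.

```shell
#!/usr/bin/env bash
# Sketch: report which of the folders named in the steps above are missing.
# check_layout and PROMPTKD_ROOT are illustrative names, not part of the repository.

check_layout() {
    local root="$1"
    local missing=0
    local dir
    for dir in \
        "results/gpt2/train/checkpoint" \
        "start_checkpoint" \
        "processed_data"; do
        if [ -d "${root}/${dir}" ]; then
            echo "ok:      ${root}/${dir}"
        else
            echo "missing: ${root}/${dir}"
            missing=1
        fi
    done
    # Exit status 0 when every folder exists, 1 otherwise.
    return "${missing}"
}

# Defaults to ~/PromptKD, the location used in the examples above.
check_layout "${PROMPTKD_ROOT:-$HOME/PromptKD}" \
    || echo "Some folders are missing; revisit the download steps."
```

Set `PROMPTKD_ROOT` if the repository was cloned somewhere other than `~/PromptKD`.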


Once the files are in place, use the scripts below to train and evaluate the models.

### Reproduce the performance of the model checkpoint used in the paper.
```
bash scripts/gpt2/eval/run_eval.sh . checkpoint/gpt2-{size}/{method}
```
Choose `size` from {base, medium, large} and `method` from {sft, kd, seqkd, gkd, minillm, promptkd}.
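
To evaluate every provided checkpoint in one pass, a loop over both sets can generate the commands. This is a minimal sketch: `list_eval_commands` is a hypothetical helper, not part of the repository, and it only prints the commands so they can be inspected before anything runs.

```shell
#!/usr/bin/env bash
# Sketch: print the evaluation command for every size/method combination.
# list_eval_commands is an illustrative helper, not part of the repository.

list_eval_commands() {
    local size method
    for size in base medium large; do
        for method in sft kd seqkd gkd minillm promptkd; do
            echo "bash scripts/gpt2/eval/run_eval.sh . checkpoint/gpt2-${size}/${method}"
        done
    done
}

# Print one command per combination (3 sizes x 6 methods).
list_eval_commands
```

Piping the output to `bash` (for example `list_eval_commands | bash`) would run the evaluations one after another from the repository root.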



### Train the model directly to reproduce the results of the paper.

Before training your own model, prepare the teacher and student models.

For the teacher, take the supervised fine-tuned gpt2-xlarge (`~/PromptKD/start_checkpoint/teacher/gpt2-xlarge`) and place it at `~/PromptKD/checkpoints/gpt2/train/sft/gpt2-xlarge`.

For the student, the starting checkpoint depends on the method:
- If the method is one of sft, kd, or seqkd, take the pre-trained gpt2-{size} (`~/PromptKD/start_checkpoint/student/pre-trained/gpt2-{size}`) and place it at `~/PromptKD/checkpoints/gpt2/gpt2-{size}`.
- If the method is one of gkd, minillm, or promptkd, take the gpt2-{size} that was supervised fine-tuned for only 3 epochs (`~/PromptKD/start_checkpoint/student/fine-tuned/gpt2-{size}`) and place it at `~/PromptKD/checkpoints/gpt2/train/sft_init/gpt2-{size}`.
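
The student rule above can be sketched as a small helper that prints the copy command for a given method and size. `student_copy_command` and its `root` argument are hypothetical, not part of the repository, and the sketch assumes the checkpoint folder itself becomes the target path (so the target should not exist yet).

```shell
#!/usr/bin/env bash
# Sketch: print the copy command that places the student starting checkpoint.
# student_copy_command and its arguments are illustrative, not part of the repository.

student_copy_command() {
    local root="$1" method="$2" size="$3"
    local src dst
    case "${method}" in
        sft|kd|seqkd)
            # These methods start from the pre-trained student.
            src="${root}/start_checkpoint/student/pre-trained/gpt2-${size}"
            dst="${root}/checkpoints/gpt2/gpt2-${size}"
            ;;
        gkd|minillm|promptkd)
            # These methods start from the student fine-tuned for 3 epochs.
            src="${root}/start_checkpoint/student/fine-tuned/gpt2-${size}"
            dst="${root}/checkpoints/gpt2/train/sft_init/gpt2-${size}"
            ;;
        *)
            echo "unknown method: ${method}" >&2
            return 1
            ;;
    esac
    # Assumption: the target folder does not exist yet, so cp -r creates it.
    echo "mkdir -p \"$(dirname "${dst}")\" && cp -r \"${src}\" \"${dst}\""
}

# Example: show the command for PromptKD with the base student.
student_copy_command "$HOME/PromptKD" promptkd base
```

The command is only printed; review it and run it manually once the paths look right.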

Finally, run the training script for your method and size, then evaluate the resulting checkpoint:
```
bash scripts/gpt2/{method}/{method}_{size}.sh
bash scripts/gpt2/eval/run_eval.sh . {method}/gpt2-{size}/{your_exp_folder}/best_rougeL
```
Choose `size` from {base, medium, large} and `method` from {sft, kd, seqkd, gkd, minillm, promptkd}.


### About Llama and OPT models

The scripts used to train OPT and Llama have all been added to the scripts folder, but the checkpoints could not be included due to the storage limitations of the anonymous Google Drive account. In the future, we will upload the models to the Hugging Face Hub to make all student models trained with PromptKD publicly available.