# Usage

## Building and logging into your Docker environment

1. Run the `docker build` command to build the image.
```bash
cd docker/$appropriate_env_dockerfile_dir
docker build -t wa_msf_img:latest .
```

2. Run the `docker run` command to launch the container.
```bash
cd ../../
docker run --runtime=nvidia -d -it -v $(pwd):/app --name wa_msf_env wa_msf_img /bin/bash
```

3. Log into the container.
```bash
# `docker exec` has no --runtime option; the runtime was already set by `docker run`.
docker exec -it wa_msf_env /bin/bash
```

## Training and testing the model

### Prepare the dataset: CMU-MOSI or MOSEI

The data for the text modality is already in the `/app/data/cmu_{mosi,mosei}` directories. For the acoustic and visual modalities, only a small part of the data is included in this archive because of size constraints.

Arrange the dataset files as follows:
```
data/
├ cmu_mosei/
│ ├ lang/ <= already exists
│ │ ├ dev.tsv
│ │ ├ test.tsv
│ │ └ train.tsv
│ ├ audio_raw.pkl <= partial data, ** need to download and place **
│ └ video_raw.pkl <= partial data, ** need to download and place **
└ cmu_mosi/
   ├ lang/ <= already exists
   │ ├ dev.tsv
   │ ├ test.tsv
   │ └ train.tsv
   ├ audio_raw.pkl <= partial data, ** need to download and place **
   └ video_raw.pkl <= partial data, ** need to download and place **
```
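
Before training, it can help to confirm that this layout is in place. The following is a minimal sketch, not part of the repository: the helper name `missing_dataset_files` is hypothetical, and the list of expected files is taken directly from the tree above.

```python
from pathlib import Path

# Files each dataset directory is expected to contain, per the tree above.
EXPECTED_FILES = (
    "lang/train.tsv",
    "lang/dev.tsv",
    "lang/test.tsv",
    "audio_raw.pkl",
    "video_raw.pkl",
)


def missing_dataset_files(data_root, datasets=("cmu_mosi", "cmu_mosei")):
    """Return the expected files that are absent under data_root (hypothetical helper)."""
    root = Path(data_root)
    return [
        root / name / rel
        for name in datasets
        for rel in EXPECTED_FILES
        if not (root / name / rel).is_file()
    ]


if __name__ == "__main__":
    # /app/data is where the data directory appears inside the container.
    for path in missing_dataset_files("/app/data"):
        print(f"missing: {path}")
```

An empty result means every listed file exists. It does not tell you whether the `.pkl` files are the full downloads or the partial ones shipped with the archive.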

#### Data structure for audio and video modalities

Each audio/visual `.pkl` file used by our model contains a tuple of three splits:
```python
(train_data, eval_data, test_data)
```
Each element of the tuple is an array with the following shape:
```python
audio = NDArray(NUM_OF_ITEM, 5000, 5, dtype=float)
video = NDArray(NUM_OF_ITEM, 1250, 35, dtype=float)
```
`5000` and `1250` are the maximum sequence lengths of the audio and video modalities, respectively, and are hyperparameters. `5` and `35` are the corresponding feature sizes.
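
A downloaded file can be checked against this layout before training. The sketch below assumes that each `.pkl` file holds the three-element tuple described above and that each element exposes a NumPy-style `.shape`. The function name `check_modality_file` and the `EXPECTED_SHAPES` table are hypothetical, and unpickling the real data requires NumPy to be installed.

```python
import pickle

# (maximum sequence length, feature size) per modality, as stated above.
EXPECTED_SHAPES = {"audio": (5000, 5), "video": (1250, 35)}


def check_modality_file(path, modality):
    """Load a .pkl file and verify the trailing dimensions of every split."""
    with open(path, "rb") as f:
        # Only unpickle files from a source you trust.
        splits = pickle.load(f)
    if len(splits) != 3:
        raise ValueError(f"expected (train, eval, test), got {len(splits)} elements")
    counts = {}
    for name, array in zip(("train", "eval", "test"), splits):
        # Each split should have shape (NUM_OF_ITEM, max_len, feature_size).
        if tuple(array.shape[1:]) != EXPECTED_SHAPES[modality]:
            raise ValueError(f"{name}: unexpected shape {tuple(array.shape)}")
        counts[name] = array.shape[0]
    return counts  # number of items per split
```

For example, `check_modality_file("/app/data/cmu_mosi/audio_raw.pkl", "audio")` would return the item count of each split if the file matches, and raise `ValueError` otherwise.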

You can download the full audio/video datasets (tar.gz archive files) from the links below:

- CMU-MOSI: https://drive.google.com/file/d/1BUvwfmTeeu36e_sdtgb_sP-qOgDaPUjf/view?usp=drive_link
- CMU-MOSEI: https://drive.google.com/file/d/1yCBmQ24VH_Nt4vOYgbJ0vCPYIjp-8o93/view?usp=drive_link

### 1st shot: train BERT separately

As described in the paper, the BERT model must be trained on its own before the full WA-MSF model is trained.

Run the commands below to start training BERT, using either `mosi` or `mosei` in place of `{mosi,mosei}`:
```bash
cd /app/pytorch/
python3 run_bert.py -p config/sample/{mosi,mosei}_bertl.json -d out/{mosi,mosei}_bertl -t 100 --gpu_id $as_you_like
```
In the command above, the `-t` option sets the number of BERT training attempts. With `-t 100`, BERT is trained 100 times and the best weights from each trial are saved, for a total of 100 sets of weights.
To keep the job running in the background, prefix the command with `nohup` and append `&`.

Select the best of these weights and proceed to the next step.

### 2nd shot: train WA-MSF

Finally, train the full WA-MSF model.

Run the command below to start training WA-MSF, using either `mosi` or `mosei` in place of `{mosi,mosei}`:
```bash
python3 run.py -p config/sample/{mosi,mosei}_normal.json -d out/{mosi,mosei}_normal -t 100 --gpu_id $as_you_like
```
We provide several configuration files covering subsets and variations of the model for testing, so choose the one that fits your needs. The `${dataset}_normal` configs are for the model with all options enabled.

Note that you need to modify the config file to use the BERT weights produced in the first shot. Set the `pretrained_path` property of the config `.json` file to the location of those weights.
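
Editing the config can also be scripted. This is a minimal sketch that assumes `pretrained_path` appears as a key somewhere in the JSON config. Its exact position is not documented here, so the helper searches nested objects. The function name `set_pretrained_path` is hypothetical.

```python
import json


def set_pretrained_path(config_in, config_out, weight_path):
    """Copy a JSON config, pointing every `pretrained_path` key at weight_path."""
    with open(config_in, encoding="utf-8") as f:
        config = json.load(f)

    def update(node):
        # Walk nested dicts/lists and count how many keys were replaced.
        replaced = 0
        if isinstance(node, dict):
            for key in node:
                if key == "pretrained_path":
                    node[key] = weight_path
                    replaced += 1
                else:
                    replaced += update(node[key])
        elif isinstance(node, list):
            for item in node:
                replaced += update(item)
        return replaced

    if update(config) == 0:
        raise KeyError("no `pretrained_path` key found in the config")
    with open(config_out, "w", encoding="utf-8") as f:
        json.dump(config, f, indent=2)
```

Writing to a separate output file leaves the sample config untouched. The new file can then be passed to `run.py` with the `-p` option.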