# XeroAlign: Zero-shot Cross-lingual Transformer Alignment
This is the code repository and instructions for the above ACL 2021 paper.

#### Acknowledgements
The starting point for this repo was cloned from [JointBERT](https://github.com/monologg/JointBERT). Thanks for sharing!

### Getting started
**`git clone`** the project first, then set up the data, models and runs as described in the sections that follow.

#### Datasets
We can provide the smaller datasets over email at milan.gritta@huawei.com.
PAWS-X is too big to send, so download it [here](https://github.com/google-research-datasets/paws/tree/master/pawsx). If you don't want to wait, you can also download MTOP [here](https://fb.me/mtop_dataset), MTOD [here](https://fb.me/multilingual_task_oriented_data) and MultiATIS++ [here](https://github.com/amazon-research/multiatis) right away.

The directory structure is as follows. Create a **`data/`** folder first. Inside it, create one subdirectory per task: **`m_atis`**, **`mtop`**, **`paws_x`** and **`xnlu`**. Now save the downloaded data into each task's directory, creating a subdirectory for each language.
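The folder layout described above can be created with a few lines of Python. This is only a minimal sketch: the helper name `make_data_dirs` and the language codes you pass in are assumptions, so use whatever language folder names match your downloaded files.

```python
from pathlib import Path

# Task folder names taken from the instructions above.
TASKS = ("m_atis", "mtop", "paws_x", "xnlu")


def make_data_dirs(root, languages):
    """Create <root>/data/<task>/<language>/ for every task and language.

    `languages` is a caller-supplied list of folder names (hypothetical
    here); not every dataset necessarily covers every language.
    """
    created = []
    for task in TASKS:
        for lang in languages:
            path = Path(root) / "data" / task / lang
            # parents=True builds data/ and the task folder as needed;
            # exist_ok=True makes the call safe to repeat.
            path.mkdir(parents=True, exist_ok=True)
            created.append(path)
    return created
```

For example, `make_data_dirs(".", ["en", "de"])` would create `data/mtop/en`, `data/mtop/de` and so on under the current directory. Remove any language folders that a given dataset does not provide.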

You are now ready to run the preprocessing code in **`preprocess.py`** :) That will generate the required files and subdirectories. Afterwards, the **`data`** folder should contain four task folders, each with one subfolder per language holding the generated files and folders. That completes data preparation.

#### Pretrained Transformers
You will need to download the XLM-R (or other) pretrained model(s). We recommend **HuggingFace** :) The base XLM-R can be downloaded [here](https://huggingface.co/xlm-roberta-base/tree/main) and the large model [here](https://huggingface.co/xlm-roberta-large/tree/main). Save these models _outside_ the project directory (same place as **`data/`**) as **`xlm-roberta-base/`** and **`xlm-roberta-large/`**. Each one should contain: **`config.json`**, **`pytorch_model.bin`** and **`sentencepiece.bpe.model`**. That should be it for models!
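Before launching a run, it can help to confirm that each model folder contains the three files listed above. The following is a small sketch; the helper name `missing_model_files` is an assumption and not part of this repository.

```python
from pathlib import Path

# File names taken from the instructions above.
REQUIRED_FILES = ("config.json", "pytorch_model.bin", "sentencepiece.bpe.model")


def missing_model_files(model_dir):
    """Return the required file names that are absent from model_dir."""
    model_dir = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (model_dir / name).is_file()]
```

Calling `missing_model_files("../xlm-roberta-base")` returns an empty list when the folder is complete. The relative path is only an example, so adjust it to wherever you saved the models.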

#### Python Environment
Everything was written with Python 3.7.9 and PyTorch 1.7.0, so install the following:
- Install [miniconda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/) for your OS, probably Linux or MacOSX
- Install [PyTorch](https://pytorch.org/get-started/locally/) with conda using something like `conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch`, using the appropriate cudatoolkit...
- Install `transformers` version 3.5.0 (or later) from HuggingFace using pip
- The previous libraries should install all other dependencies; no extra packages should be required
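To check that the environment matches the versions listed above, a short script can report what is installed. This is a sketch under the assumption that the packages are importable as `torch` and `transformers`; the helper names are hypothetical.

```python
import importlib
import sys


def installed_version(package):
    """Return the package's __version__, or None if it cannot be imported."""
    try:
        module = importlib.import_module(package)
    except ImportError:
        return None
    # Some modules do not expose __version__; report that explicitly.
    return getattr(module, "__version__", "unknown")


def environment_report(packages=("torch", "transformers")):
    """Collect the Python version and the versions of the given packages."""
    report = {"python": sys.version.split()[0]}
    for name in packages:
        report[name] = installed_version(name)
    return report
```

Printing `environment_report()` inside the conda environment should show versions at or above those listed; a value of `None` means the package is not installed there.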

#### Running Experiments

In the **`config`** folder, we saved most of the setups reported in the paper as shell files (though not all of them, because we reported lots of numbers/tables). That should be enough to reproduce our runs. Here is an example:

Open the command line and type: **`nohup ./config/mtop.sh mtop_aligned mtop &`**. This command will run the **`mtop_aligned`** experiments with XLM-R Large. The base model can be launched by using **`'... base_mtop &'`** instead.

The command **`nohup ./config/paws_x.sh paws_x_english base_paws_x &`** will train the base XLM-R on PAWS-X English, for example.

Once you have trained an English model for MultiATIS++, for instance, you can type: **`nohup ./config/m_atis.sh m_atis_zero_shot base_m_atis &`**. This will give you the baseline zero-shot scores for M-ATIS++ with XLM-R base.

Finally, **`nohup ./config/xnlu.sh xnlu_target xnlu &`** should train the large XLM-R on the labelled data, referred to as 'Target' in the paper.

That should give a good idea for further runs. If unsure, look inside the shell file for clues :)
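The example commands above appear to follow one pattern, which the sketch below composes. The pattern is inferred from those examples only (the last argument is the task name for XLM-R Large and `base_<task>` for XLM-R base), and the helper `build_run_command` is hypothetical, so check the shell file before relying on it.

```python
def build_run_command(task, experiment, base=False):
    """Compose a launch command in the style of the examples above.

    Assumption inferred from those examples: the final argument is the
    task name for XLM-R Large and "base_<task>" for XLM-R base.
    """
    model_arg = f"base_{task}" if base else task
    return ["nohup", f"./config/{task}.sh", experiment, model_arg]
```

For instance, `" ".join(build_run_command("paws_x", "paws_x_english", base=True))` gives `nohup ./config/paws_x.sh paws_x_english base_paws_x`, to which you can append `&` in the shell to run it in the background.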
