# Hierarchical Transformers Are More Efficient Language Models - Code Appendix

## Model

The model is implemented in `./trax/models/research/hourglass.py`.

## Setup

The configuration below was tested with Python `3.8.10`.

To install all the required dependencies, run the following script from the
root directory of this repository:
```
./install_deps.sh
```

## Experiments

The configuration for each of the main results is available as a `gin` file
in the `./experiments` directory.

To run an experiment, activate the virtual environment and start the trainer (example for enwik8):

```
. venv/bin/activate
python3 -m trax.trainer --config_file ./experiments/hourglass_enwik8.gin --output_dir ./trainings
```
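
To run several configurations in sequence, the command above can be assembled programmatically. The following is a minimal sketch, not part of this repository: the helper names `list_configs` and `build_trainer_command` are hypothetical, and writing each run to its own subdirectory of `./trainings` is an assumption made here to keep checkpoints from different runs apart. The script only prints the commands, so they can be inspected before being run inside the activated virtual environment.

```python
import shlex
from pathlib import Path


def list_configs(experiments_dir="./experiments"):
    """Return the sorted .gin files found in the experiments directory."""
    return sorted(Path(experiments_dir).glob("*.gin"))


def build_trainer_command(config_file, output_dir="./trainings", python="python3"):
    """Assemble the trainer invocation from this README as an argument list."""
    return [
        python, "-m", "trax.trainer",
        "--config_file", str(config_file),
        "--output_dir", str(output_dir),
    ]


def main():
    for config in list_configs():
        # Assumption: one output subdirectory per config, named after the file.
        output_dir = Path("./trainings") / config.stem
        print(shlex.join(build_trainer_command(config, output_dir)))


if __name__ == "__main__":
    main()
```

The argument list returned by `build_trainer_command` can also be passed directly to `subprocess.run` if the runs should be launched from Python rather than copied into a shell.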

Note that these experiments require GPU/TPU accelerators.

## Shorten factor dropout

For the shorten factor dropout experiments, go to the `sf_dropout/` directory, where we provide a PyTorch reproduction
of our paper based on the Transformer-XL codebase (see `README.md` in that directory for the relevant instructions).

