[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
# Bias Mitigation in Machine Translation Quality Estimation

Machine Translation Quality Estimation (QE) aims to build predictive models to assess the quality of machine-generated translations in the absence of reference translations. While state-of-the-art QE models have been shown to achieve good results, they over-rely on features that do not have a causal impact on the quality of a translation. In particular, there appears to be a partial input bias, i.e., a tendency to assign high-quality scores to translations that are fluent and grammatically correct, even though they do not preserve the meaning of the source. We analyse the partial input bias in further detail and evaluate four approaches to use auxiliary tasks for bias mitigation. Two approaches use additional data to inform and support the main task, while the other two are adversarial, actively discouraging the model from learning the bias. We compare the methods with respect to their ability to reduce the partial input bias while maintaining the overall performance. We find that training a multitask architecture with an auxiliary binary classification task that utilises additional augmented data achieves the desired effects and generalises well to different languages and quality metrics.

This repository builds upon MonoTransQuest and includes all modifications and added functionality that were made as part of this research project. In particular, it includes MultiTransQuest, an alternative architecture that can be trained with multiple auxiliary tasks. We also provide four example files so that the best model for each proposed bias mitigation method can easily be re-trained. The original repository can be found [here](https://github.com/TharinduDR/TransQuest).



## Structure of the Repository

### CODE

The repository features two architectures for sentence-level translation quality estimation:

- **MonoTransQuest**, the original architecture proposed by Ranasinghe *et al.*, with modifications that enable training with focal loss. All relevant files for the MonoTransQuest architecture are located in the folder *CODE/transquest/algo/sentence_level/monotransquest*.

- **MultiTransQuest**, a modified version of MonoTransQuest to allow training with auxiliary tasks in a multitask setup. All relevant files for the MultiTransQuest architecture are located in the folder *CODE/transquest/algo/sentence_level/multitransquest*.
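To make the multitask idea concrete, the following is a minimal sketch of how a main regression loss and an auxiliary binary classification loss could be combined into one training objective. It is not taken from the MultiTransQuest code: the function names, the use of mean squared error for the main task, and the per-task weights are all assumptions made for illustration.

```python
import math


def mse_loss(predictions, targets):
    """Mean squared error, assumed here as the main quality-score loss."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


def bce_loss(probabilities, labels, eps=1e-12):
    """Binary cross-entropy for a hypothetical auxiliary classification task."""
    total = 0.0
    for p, y in zip(probabilities, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)


def multitask_loss(main_loss, aux_losses, aux_weights):
    """Main loss plus a weighted sum of auxiliary losses (weights are assumed)."""
    return main_loss + sum(w * l for w, l in zip(aux_weights, aux_losses))


# Toy batch: two predicted quality scores and one auxiliary prediction.
main = mse_loss([0.5, 1.0], [0.5, 0.0])
aux = bce_loss([0.5], [1])
total = multitask_loss(main, [aux], aux_weights=[0.5])
```

Keeping the auxiliary weight explicit makes it easy to see how strongly the auxiliary task influences training relative to the main task; the actual weighting scheme used in this repository is controlled through *multitransquest_config.py*.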

In addition to the architectures, the folder *CODE/transquest/training* contains **utility functions for training**, including the functions written to execute the different experiments. Most importantly, it also holds the main config files *monotransquest_config.py* and *multitransquest_config.py*, which control the hyperparameters.

Trained models are automatically saved to *CODE/temp*.


## Installation
1. We provide the necessary data required for the trial runs. To use other data resources, please adapt the data loaders.
2. Install the dependencies from the provided *requirements.txt* file:
   `pip install -r requirements.txt`
3. In *CODE/transquest/training/monotransquest_config.py* and *CODE/transquest/training/multitransquest_config.py*, adapt the project directory variable `PROJECT_DIR` so that it matches the local location of the project.
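As an illustration of step 3, the sketch below shows one way to set the variable and confirm that it points at the right place. Only the name `PROJECT_DIR` comes from the config files; the example path, the helper `has_training_folder`, and the assumption that the *CODE* folder sits directly under `PROJECT_DIR` are hypothetical and may need adjusting to the actual layout.

```python
from pathlib import Path

# Hypothetical location of the local checkout; replace with your own path.
PROJECT_DIR = Path.home() / "projects" / "qe-bias-mitigation"


def has_training_folder(project_dir):
    """Return True if CODE/transquest/training exists under project_dir.

    Assumes the CODE folder is a direct child of the project directory.
    """
    return (Path(project_dir) / "CODE" / "transquest" / "training").is_dir()


if not has_training_folder(PROJECT_DIR):
    print(f"PROJECT_DIR does not look like the project root: {PROJECT_DIR}")
```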

## Examples
We provide four files that can be executed to reproduce the best models per experiment section. Navigate into the main project folder and run one of the following commands:

- `python3 run_multitask_mixed_languages.py`
- `python3 run_multitask_shuffled_data.py`
- `python3 run_multitask_adversarial.py`
- `python3 run_focalloss.py`

Each file contains a small selection of settings that override the defaults in the config files. For the full list of tunable parameters, refer to *monotransquest_config.py* (for MonoTransQuest) and *multitransquest_config.py* (for MultiTransQuest).
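The way per-experiment settings take precedence over the config defaults can be pictured with a small dictionary merge. This is a sketch only: the keys and values in `default_config` and the helper `apply_overrides` are hypothetical and are not copied from the config files or the run scripts.

```python
from copy import deepcopy

# Hypothetical defaults standing in for the contents of a config file.
default_config = {
    "learning_rate": 2e-5,
    "num_train_epochs": 3,
    "train_batch_size": 8,
}


def apply_overrides(defaults, overrides):
    """Return a copy of defaults with the given overrides applied.

    Unknown keys raise an error so that typos in an experiment file are
    caught early; this check is a design choice of the sketch.
    """
    unknown = set(overrides) - set(defaults)
    if unknown:
        raise KeyError(f"Unknown config keys: {sorted(unknown)}")
    merged = deepcopy(defaults)
    merged.update(overrides)
    return merged


# An experiment file would supply only the settings it wants to change.
experiment_config = apply_overrides(default_config, {"num_train_epochs": 5})
```

Working on a copy leaves the defaults untouched, so several experiments can be configured in the same session without affecting one another.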

**Important:** Please note that the training process involves loading and training XLM-R base. We recommend training the models on a GPU. Saving the models consumes up to 2 GB of storage.


## Citations

```bibtex
@InProceedings{transquest:2020a,
author = {Ranasinghe, Tharindu and Orasan, Constantin and Mitkov, Ruslan},
title = {TransQuest: Translation Quality Estimation with Cross-lingual Transformers},
booktitle = {Proceedings of the 28th International Conference on Computational Linguistics},
year = {2020}
}
```

```bibtex
@InProceedings{transquest:2020b,
author = {Ranasinghe, Tharindu and Orasan, Constantin and Mitkov, Ruslan},
title = {TransQuest at WMT2020: Sentence-Level Direct Assessment},
booktitle = {Proceedings of the Fifth Conference on Machine Translation},
year = {2020}
}
```
