
## Installation

- Python >= 3.8, PyTorch
- `pip install -r requirements.txt`
- For the baseline (in a separate environment): `pip install -r baseline/requirements-baseline.txt`


## Usage Examples

### Reproducing Figure 1
```python
import math
from translation_models.mbart_models import MbartScoringModel

# Forward (en -> de) scoring model
forward_model = MbartScoringModel("facebook/mbart-large-50-one-to-many-mmt", src_lang="en_XX", tgt_lang="de_DE", device=0)

src = "Please exit the plane after landing."
tgt = "Bitte verlassen Sie das Flugzeug."

# Copies of the source, each with one part deleted
partial_sources = [
    "exit the plane after landing.",
    "Please exit after landing.",
    "Please exit the plane.",
]

# Score the same translation against the full source and each partial source
scores = forward_model.score(
    source_sentences=[src] + partial_sources,
    hypothesis_sentences=(4 * [tgt]),
)
# Exponentiate the scores before printing
print(list(map(math.exp, scores)))
```
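
The printed values can be compared against the score of the full source to see which deletion makes the translation more likely. The following is a minimal sketch of that comparison, not part of this repository: the helper `flag_likelier_partial_sources` is hypothetical, it assumes that the first score belongs to the full source and that higher scores mean a more likely translation, and the numbers are made up for illustration.

```python
import math
from typing import List, Sequence


def flag_likelier_partial_sources(
    log_scores: Sequence[float],
    partial_sources: Sequence[str],
) -> List[str]:
    """Return the partial sources that score higher than the full source.

    Hypothetical helper: assumes log_scores[0] belongs to the full source
    and the remaining entries line up with partial_sources.
    """
    if len(log_scores) != len(partial_sources) + 1:
        raise ValueError("expected one score for the full source plus one per partial source")
    full_score = log_scores[0]
    return [
        partial
        for partial, score in zip(partial_sources, log_scores[1:])
        if score > full_score
    ]


# Made-up scores for illustration only; real values would come from forward_model.score(...)
example_scores = [math.log(0.10), math.log(0.08), math.log(0.09), math.log(0.30)]
example_partials = [
    "exit the plane after landing.",
    "Please exit after landing.",
    "Please exit the plane.",
]

flagged = flag_likelier_partial_sources(example_scores, example_partials)
print(flagged)
```

With these illustrative numbers, only the partial source without "after landing" is flagged, which would suggest that this part of the source is missing from the translation.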

### Using CoverageDetector
```python
from contrastive_conditioning.coverage_detector import CoverageDetector
from translation_models.mbart_models import MbartScoringModel

detector = CoverageDetector(
    src_language="en",
    tgt_language="de",
    forward_evaluator=MbartScoringModel("facebook/mbart-large-50-one-to-many-mmt", src_lang="en_XX", tgt_lang="de_DE", device=0),
    backward_evaluator=MbartScoringModel("facebook/mbart-large-50-many-to-one-mmt", src_lang="de_DE", tgt_lang="en_XX", device=1),
)
result = detector.detect_errors(
    src="Please exit the plane after landing.",
    translation="Bitte verlassen Sie das Flugzeug.",
)
print(result)
```

## Further Usage

## Segment-Level Comparison to Gold Data
### Run contrastive conditioning
```shell
mkdir predictions
python -m evaluation.predict_mqm en-de
```

### Evaluate contrastive conditioning
```shell
python -m evaluation.evaluate_mqm en-de test
```

### Train the baseline
```shell
kiwi train qe_baseline/kiwi_config_en-de.yaml
```

### Evaluate the baseline
```shell
python -m evaluation.evaluate_mqm_baseline en-de test <checkpoint_path>
```

## Synthetic Data

### Generate synthetic data
- Download monolingual data (e.g. English text from http://data.statmt.org/news-crawl/)
- `python -m data_generation.generate en-de <dataset_name>`

### Run contrastive conditioning on synthetic data
```shell
python -m evaluation.predict_synthetic_testset en-de <dataset_name>
```

### Evaluate on synthetic test set
```shell
python -m evaluation.evaluate_on_synthetic_data en-de <dataset_name> <baseline_predicted_source_tags_labels_path> <gold_source_tags> <gold_target_tags>
```

## License

MIT License, except for the data directory.
