## Results

- Training stage (**without** Task-specific Training Data)

  - `BERT-base + Translated ReLU`

    ```bash
    +-------+-------+-------+-------+-------+--------------+-----------------+-------+
    | STS12 | STS13 | STS14 | STS15 | STS16 | STSBenchmark | SICKRelatedness |  Avg. |
    +-------+-------+-------+-------+-------+--------------+-----------------+-------+
    | 72.51 | 75.46 | 72.34 | 78.46 | 72.64 |    76.54     |      72.02      | 74.28 |
    +-------+-------+-------+-------+-------+--------------+-----------------+-------+
    ```

  - `BERT-base + Smooth K2 Loss`

    ```bash
    +-------+-------+-------+-------+-------+--------------+-----------------+-------+
    | STS12 | STS13 | STS14 | STS15 | STS16 | STSBenchmark | SICKRelatedness |  Avg. |
    +-------+-------+-------+-------+-------+--------------+-----------------+-------+
    | 72.39 | 78.33 | 75.28 | 80.26 | 74.52 |    78.78     |      72.65      | 76.03 |
    +-------+-------+-------+-------+-------+--------------+-----------------+-------+
    ```

  - `RoBERTa-base + Translated ReLU`

    ```bash
    +-------+-------+-------+-------+-------+--------------+-----------------+-------+
    | STS12 | STS13 | STS14 | STS15 | STS16 | STSBenchmark | SICKRelatedness |  Avg. |
    +-------+-------+-------+-------+-------+--------------+-----------------+-------+
    | 71.13 | 76.07 | 72.18 | 78.13 | 73.94 |    77.59     |      70.94      | 74.28 |
    +-------+-------+-------+-------+-------+--------------+-----------------+-------+
    ```

  - `RoBERTa-base + Smooth K2 Loss`

    ```bash
    +-------+-------+-------+-------+-------+--------------+-----------------+-------+
    | STS12 | STS13 | STS14 | STS15 | STS16 | STSBenchmark | SICKRelatedness |  Avg. |
    +-------+-------+-------+-------+-------+--------------+-----------------+-------+
    | 72.53 | 78.28 | 73.88 | 80.88 | 75.35 |    77.44     |      73.94      | 76.04 |
    +-------+-------+-------+-------+-------+--------------+-----------------+-------+
    ```

- Fine-tuning stage (**with** Task-specific Training Data)

  - `BERT-base + Smooth K2 Loss`

    ```bash
    +-------+-------+-------+-------+-------+--------------+-----------------+-------+
    | STS12 | STS13 | STS14 | STS15 | STS16 | STSBenchmark | SICKRelatedness |  Avg. |
    +-------+-------+-------+-------+-------+--------------+-----------------+-------+
    | 73.68 | 88.42 | 86.10 | 86.56 | 79.63 |    84.12     |      82.01      | 82.93 |
    +-------+-------+-------+-------+-------+--------------+-----------------+-------+
    ```

  - `RoBERTa-base + Smooth K2 Loss`

    ```bash
    +-------+-------+-------+-------+-------+--------------+-----------------+-------+
    | STS12 | STS13 | STS14 | STS15 | STS16 | STSBenchmark | SICKRelatedness |  Avg. |
    +-------+-------+-------+-------+-------+--------------+-----------------+-------+
    | 73.83 | 89.00 | 84.16 | 87.95 | 81.94 |    84.64     |      81.07      | 83.23 |
    +-------+-------+-------+-------+-------+--------------+-----------------+-------+
    ```
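
The `Avg.` column is consistent with the arithmetic mean of the seven task scores, rounded to two decimals. The sketch below re-derives it from rows copied out of the tables above. The helper name `average_score` is only illustrative and is not part of this repository.

```python
from statistics import mean

# Column order used in every results table above.
TASKS = ["STS12", "STS13", "STS14", "STS15", "STS16",
         "STSBenchmark", "SICKRelatedness"]

def average_score(scores):
    """Return the mean of the per-task scores, rounded to two decimals."""
    if len(scores) != len(TASKS):
        raise ValueError(f"expected {len(TASKS)} scores, got {len(scores)}")
    return round(mean(scores), 2)

# Fine-tuning stage rows, copied from the tables above.
bert_sk2 = [73.68, 88.42, 86.10, 86.56, 79.63, 84.12, 82.01]
roberta_sk2 = [73.83, 89.00, 84.16, 87.95, 81.94, 84.64, 81.07]

print("BERT-base + Smooth K2 Loss:", average_score(bert_sk2))
print("RoBERTa-base + Smooth K2 Loss:", average_score(roberta_sk2))
```

The same check reproduces the reported averages for the four training-stage rows.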

## Data

- Training data: `nli_012.csv`
- Fine-tuning data: `sts-sick-round.jsonl`
- Link: https://drive.google.com/drive/folders/1ZL8gMOsxn6JFsH7h0vSSb1PjwFrdO-LQ?usp=sharing
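
The column layout of the two files is not documented here, so it may help to look at the first few records before training. The following is a minimal sketch using only the standard library. The `data/` directory and the helper names `peek_csv` and `peek_jsonl` are assumptions for illustration, not part of this repository.

```python
import csv
import json
from itertools import islice
from pathlib import Path

def peek_csv(path, n=3):
    """Return the first n rows of a CSV file as lists of strings."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(islice(csv.reader(f), n))

def peek_jsonl(path, n=3):
    """Return the first n records of a JSON Lines file (one JSON value per line)."""
    with open(path, encoding="utf-8") as f:
        non_empty = (line for line in f if line.strip())
        return [json.loads(line) for line in islice(non_empty, n)]

# Hypothetical location; adjust to wherever the files were downloaded.
DATA_DIR = Path("data")

for name, peek in [("nli_012.csv", peek_csv),
                   ("sts-sick-round.jsonl", peek_jsonl)]:
    path = DATA_DIR / name
    if path.exists():
        for record in peek(path):
            print(name, record)
    else:
        print(f"{name}: not found in {DATA_DIR}")
```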

## Checkpoints

- Link: https://drive.google.com/drive/folders/1-te5DByiDpcKd-AxZPmCEgkRqnaTvzmk?usp=sharing

  Abbreviations in the checkpoint names:

  - `tlu`: `TranslatedReLU`
  - `sk2`: `SmoothK2Loss`

## Setup

- Python: 3.9.16

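```bash
pip install -r requirements.txt
# Training stage on 4 GPUs; run one stage at a time, each with its own log file
nohup torchrun --nproc_per_node=4 train.py > train.out 2>&1 &
# Fine-tuning stage on 4 GPUs
nohup torchrun --nproc_per_node=4 tune.py > tune.out 2>&1 &
```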
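
Before launching, a quick pre-flight check can catch a mismatched interpreter or too few GPUs. This is a hedged sketch: `check_environment` is a hypothetical helper, the expected values are taken from the Python version and `--nproc_per_node=4` stated above, and other Python versions may well work.

```python
import sys

EXPECTED_PYTHON = (3, 9)  # from "Python: 3.9.16" above
REQUIRED_GPUS = 4         # from --nproc_per_node=4 above

def check_environment(python_version, gpu_count):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    major, minor = python_version[0], python_version[1]
    if (major, minor) != EXPECTED_PYTHON:
        problems.append(f"Python {major}.{minor} differs from the stated 3.9")
    if gpu_count < REQUIRED_GPUS:
        problems.append(f"{gpu_count} GPU(s) visible, {REQUIRED_GPUS} requested")
    return problems

try:
    import torch
    gpus = torch.cuda.device_count()
except ImportError:
    gpus = 0  # PyTorch is not installed yet

for problem in check_environment(sys.version_info, gpus):
    print("WARNING:", problem)
```

If fewer GPUs are available, lowering `--nproc_per_node` to the visible count is the usual adjustment, though the reported results were presumably obtained with 4 processes.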
