# Automatically Generated Definitions and their utility for Modeling Word Meaning
This is the official repository for our paper _Automatically Generated Definitions and their utility for Modeling Word Meaning_.

## Table of Contents
- [Abstract](#abstract)
- [Getting Started](#getting-started)
- [Reproducing Results](#reproducing-results)
- [References](#references)

## Abstract
Even with unprecedented advancements in text generation, modeling lexical semantics is still a challenging task, often suffering from interpretability pitfalls. In this paper, we delve into the generation of dictionary-like sense definitions and explore their utility for modeling word meaning. We fine-tune two Llama models and include an existing T5-based model in our evaluation. First, we evaluate the quality of the generated definitions on existing benchmarks, setting new state-of-the-art results for the Definition Generation task. Next, we explore the use of definitions generated by our models as intermediate representations that are subsequently encoded as sentence embeddings. We evaluate this approach on three lexical semantics tasks, namely Word-in-Context, Word Sense Induction, and Lexical Semantic Change, setting new state-of-the-art results in all three.

## Getting Started
Our research leveraged the <a href="https://www.c3se.chalmers.se/about/Alvis/">Alvis</a> cluster, a national <a href="https://www.naiss.se/">NAISS</a> (National Academic Infrastructure for Supercomputing in Sweden) resource dedicated to Artificial Intelligence and Machine Learning research. To facilitate reproducibility, we provide <a href="https://slurm.schedmd.com/sbatch.html">sbatch</a> files that streamline the process. If you intend to use NAISS resources, simply edit these files to include your NAISS project ID and run them with `sbatch`. Alternatively, you can run them directly with `bash` in other computing environments.

Before you begin, ensure you have met the following requirements:
- <img src="https://miro.medium.com/v2/resize:fit:1400/1*lSTuwS4exV_s__kcShxk8w.png" width="20" height="20"> Python 3.11.3
- <img src="https://cdn-images-1.medium.com/max/580/0*Kt5_0uGLlCFAgbt6.png" width="25" height="25"> Required Python packages (listed in `requirements.txt`)

If you are using a cluster, you can directly load Python 3.11 with:

```bash
module load PyTorch/2.1.2-foss-2023a-CUDA-12.1.1
```

To install the required packages, you can create a virtual environment and use pip:

```bash
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```
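Before launching any job, it can help to confirm that the active interpreter matches the version listed in the requirements. The snippet below is a minimal sketch of such a check and is not part of this repository; the helper name `meets_minimum` is hypothetical, and the `(3, 11)` threshold is simply taken from the requirement stated above.

```python
import sys

# Minimum version taken from the requirements listed above (Python 3.11.x).
MINIMUM = (3, 11)


def meets_minimum(version, minimum=MINIMUM):
    """Return True if a (major, minor, ...) version tuple is at least `minimum`."""
    return tuple(version[:2]) >= tuple(minimum)


if __name__ == "__main__":
    current = sys.version_info
    status = "OK" if meets_minimum(current) else "too old"
    print(f"Python {current.major}.{current.minor}.{current.micro}: {status}")
```

Running it inside the activated virtual environment shows whether `venv` picked up the intended interpreter.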

## Reproducing Results
NOTE: We discourage running our commands without first inspecting their content. They can allocate substantial computing resources and consume a significant amount of time and storage space. We highly recommend reviewing each script line by line before execution to maintain full control over the experiments.

You can find our sbatch files in the `sbatch` folder and our Python code in the `src` folder. Feel free to contact us with any issues or questions. If you intend to use our models without re-training them or re-running our evaluation, we are happy to inform you that they will be released on Hugging Face soon.

NOTE: Unless a parameter is explicitly set, each script uses its default value.

### Download data
Download the data used in <a href="https://aclanthology.org/2023.acl-long.176/">previous work</a>.
```bash 
sbatch sbatch/download.sh
```

### Fine-tuning
Fine-tune the Meta Llama models. This command launches several fine-tuning runs with different parameters.
```bash 
bash sbatch/finetuning_batch.sh
```

### Generation
Generate definitions for the Definition Generation, Word-in-Context, Word Sense Induction, and Lexical Semantic Change tasks. Each of these commands executes several scripts.
```bash 
bash sbatch/generation_batch.sh # DG llamadictionary
```
```bash 
bash sbatch/t5_generation_batch.sh # DG flan-t5
```
```bash 
bash sbatch/generation_batch_wic.sh # WiC llamadictionary and flan-t5
```
```bash 
bash sbatch/generation_batch_lsc.sh # WSI-LSC llamadictionary and flan-t5
```

### Evaluation and results
Evaluate the use of definition embeddings for the Definition Generation, Word-in-Context, Word Sense Induction, and Lexical Semantic Change tasks. Each of these commands executes several scripts. Multiple SBERT models are tested under different settings (e.g., different sentence lengths and short-word removal for LSC-WSI).

```bash 
bash sbatch/evaluation_batch.sh # DG llamadictionary and flan-t5
```
```bash 
bash sbatch/wic_evaluation.sh # multiple sbert models
```
```bash 
bash sbatch/wsi-lsc_evaluation.sh # multiple sbert models
```
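As described in the abstract, these evaluations encode generated definitions as sentence embeddings and compare them. The sketch below illustrates only the comparison step for a Word-in-Context style decision. It is not the code in `src`: the bag-of-words `embed` function is a toy stand-in for an SBERT encoder, and the function names, the example definitions, and the `0.5` threshold are all assumptions made for illustration.

```python
import math
from collections import Counter


def embed(definition):
    """Toy stand-in for a sentence encoder: a bag-of-words count vector."""
    return Counter(definition.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(u[token] * v[token] for token in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def same_sense(definition_a, definition_b, threshold=0.5):
    """Predict 'same meaning' when the two definitions are similar enough."""
    return cosine(embed(definition_a), embed(definition_b)) >= threshold


if __name__ == "__main__":
    # Hypothetical generated definitions for "bank" in three usages.
    river_1 = "the land alongside a river"
    river_2 = "sloping land alongside a river or lake"
    money = "a financial institution that accepts deposits"
    print(same_sense(river_1, river_2))
    print(same_sense(river_1, money))
```

In a realistic setting, `embed` would be replaced by a dense sentence encoder and the threshold would be tuned on development data; the cosine comparison itself stays the same.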

### Our generated definitions and results
To print the evaluation results we obtained on the Definition Generation task, run:
```bash
python src/print_dg_scores.py
```
To reproduce the plots in our paper, run:
```bash
python src/lora_plot.py --qlora --metric bertscore
python src/lora_plot.py --qlora --metric nltk_bleu
python src/lora_plot.py --lora --metric bertscore
python src/lora_plot.py --lora --metric nltk_bleu
python src/wic_plot.py
python src/wsi_plot.py
python src/lsc_plot.py
```

Our results are available in the `research-output` folder, while the downloaded and processed datasets will be available (after running our scripts) in the `datasets`, `wic`, and `dwug_en` folders. Note that the `*answers` folders contain the generated definitions for each line in the considered datasets, while the `*evaluation` folders contain the scores for each metric considered in the DG task.
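
To get an overview of which outputs are present after a run, the folders named above can be listed programmatically. The following is a minimal sketch, not a script shipped in `src`: the helper `find_output_folders` is hypothetical, and it assumes that the `*answers` and `*evaluation` folders sit somewhere under the root you pass in (here `research-output`).

```python
from pathlib import Path


def find_output_folders(root):
    """Group sub-folders under `root` whose names end in 'answers' or 'evaluation'."""
    root = Path(root)
    groups = {"answers": [], "evaluation": []}
    if not root.is_dir():
        # Nothing to list if the folder has not been created yet.
        return groups
    for path in sorted(root.rglob("*")):
        if not path.is_dir():
            continue
        for suffix in groups:
            if path.name.endswith(suffix):
                groups[suffix].append(path)
    return groups


if __name__ == "__main__":
    # Assumption: the folders live somewhere under `research-output`.
    for kind, folders in find_output_folders("research-output").items():
        print(f"{kind}: {len(folders)} folder(s)")
        for folder in folders:
            print(f"  {folder}")
```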
