## Environment
- Python==3.8
- PyTorch==1.8.0
- transformers==3.1.0
- tensorboardx==2.4
- lxml==4.6.3
- beautifulsoup4==4.9.3
- bs4==0.0.1
- stanza==1.2
- sentencepiece==0.1.95
- ipdb==0.13.9
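
To confirm that an existing environment matches these pins, a small check along the following lines may help. This is only a sketch: the helper names (`installed_version`, `find_mismatches`) are hypothetical, and `torch` and `tensorboardX` are assumed to be the distribution names for PyTorch and tensorboardx. Versions are compared as exact strings after dropping any local build suffix such as `+cu111`.

```python
from importlib import metadata

# Pins copied from the list above. "torch" and "tensorboardX" are the
# distribution names assumed here for PyTorch and tensorboardx.
PINS = {
    "torch": "1.8.0",
    "transformers": "3.1.0",
    "tensorboardX": "2.4",
    "lxml": "4.6.3",
    "beautifulsoup4": "4.9.3",
    "stanza": "1.2",
    "sentencepiece": "0.1.95",
}


def installed_version(name):
    """Return the installed version without a local suffix, or None if absent."""
    try:
        # "1.8.0+cu111" -> "1.8.0"
        return metadata.version(name).split("+")[0]
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pins, lookup=installed_version):
    """Map each package that differs from its pin to (expected, found)."""
    mismatches = {}
    for name, expected in pins.items():
        found = lookup(name)
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in find_mismatches(PINS).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

An empty result means every listed package is installed at the pinned version.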

## Datasets

We support `ace05e`, `ace05ep`, and `ere`.

### Preprocessing
#### `ace05e`
1. Preprocess the ACE 2005 data by following the [DyGIE++](https://github.com/dwadden/dygiepp#ace05-event) instructions
2. Put the processed data into the folder `processed_data/ace05e_dygieppformat`
3. Run `./scripts/process_ace05e.sh`

#### `ace05ep`
1. Download ACE data from [LDC](https://catalog.ldc.upenn.edu/LDC2006T06)
2. Run `./scripts/process_ace05ep.sh`

#### `ere`
1. Download the following ERE English packages from LDC:
   - `LDC2015E29_DEFT_Rich_ERE_English_Training_Annotation_V2`
   - `LDC2015E68_DEFT_Rich_ERE_English_Training_Annotation_R2_V2`
   - `LDC2015E78_DEFT_Rich_ERE_Chinese_and_English_Parallel_Annotation_V2`
2. Place all three packages under a single directory with the following layout:
```
ERE
├── LDC2015E29_DEFT_Rich_ERE_English_Training_Annotation_V2
│     ├── data
│     ├── docs
│     └── ...
├── LDC2015E68_DEFT_Rich_ERE_English_Training_Annotation_R2_V2
│     ├── data
│     ├── docs
│     └── ...
└── LDC2015E78_DEFT_Rich_ERE_Chinese_and_English_Parallel_Annotation_V2
      ├── data
      ├── docs
      └── ...
```
3. Run `./scripts/process_ere.sh`
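
Before running the script, the layout above can be sanity-checked with a short helper such as the one below. It is a sketch under assumptions: `missing_ere_dirs` is a hypothetical name, only the `data` and `docs` subdirectories shown in the tree are checked, and the root path to pass depends on where `process_ere.sh` expects the `ERE` directory, which is not stated here.

```python
from pathlib import Path

# Package names copied from the download step above.
ERE_PACKAGES = [
    "LDC2015E29_DEFT_Rich_ERE_English_Training_Annotation_V2",
    "LDC2015E68_DEFT_Rich_ERE_English_Training_Annotation_R2_V2",
    "LDC2015E78_DEFT_Rich_ERE_Chinese_and_English_Parallel_Annotation_V2",
]
# Subdirectories shown in the tree above.
EXPECTED_SUBDIRS = ["data", "docs"]


def missing_ere_dirs(root):
    """Return the expected directories (relative to root) that do not exist."""
    root = Path(root)
    missing = []
    for package in ERE_PACKAGES:
        for sub in EXPECTED_SUBDIRS:
            rel = Path(package) / sub
            if not (root / rel).is_dir():
                missing.append(rel.as_posix())
    return missing


if __name__ == "__main__":
    # "ERE" is the directory name used in the tree above.
    for rel in missing_ere_dirs("ERE"):
        print(f"missing: {rel}")
```

An empty list means all three packages are in place with both subdirectories.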

## Training

### EAE

Run `./scripts/train_GenEAE.sh` or use the following commands:

- Generate data for GenEAE
```Bash
python generate_data_GenEAE.py -c config/config_GenEAE_ace05e.json
```

- Train GenEAE

```Bash
python train_GenEAE.py -c config/config_GenEAE_ace05e.json
```

By default, the model is saved to `./output/GenEAE_ace05e/[timestamp]/best_model.mdl`.

### ED

Run `./scripts/train_GenED.sh` or use the following commands:

- Generate data for GenED
```Bash
python generate_data_GenED.py -c config/config_GenED_ace05e.json
```

- Train GenED
```Bash
python train_GenED.py -c config/config_GenED_ace05e.json
```

By default, the model is saved to `./output/GenED_ace05e/[timestamp]/best_model.mdl`.

### End2End

Run `./scripts/train_GenE2E.sh` or use the following commands:

- Generate data for GenE2E
```Bash
python generate_data_GenE2E.py -c config/config_GenE2E_ace05e.json
```

- Train GenE2E
```Bash
python train_GenE2E.py -c config/config_GenE2E_ace05e.json
```

By default, the model is saved to `./output/GenE2E_ace05e/[timestamp]/best_model.mdl`.
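
Because each run writes into its own `[timestamp]` directory, it can be handy to look up the most recent checkpoint programmatically. The helper below is a minimal sketch, not part of the repository: `latest_best_model` is a hypothetical name, and it assumes that the timestamp directory names sort chronologically when compared as strings, which may not hold for every timestamp format.

```python
from pathlib import Path


def latest_best_model(output_dir, filename="best_model.mdl"):
    """Return the checkpoint from the most recent run directory, or None."""
    output_dir = Path(output_dir)
    if not output_dir.is_dir():
        return None
    run_dirs = [p for p in output_dir.iterdir() if p.is_dir()]
    # Assumption: directory names are timestamps that sort chronologically.
    for run_dir in sorted(run_dirs, key=lambda p: p.name, reverse=True):
        candidate = run_dir / filename
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    for task in ("GenEAE", "GenED", "GenE2E"):
        print(task, latest_best_model(f"./output/{task}_ace05e"))
```

Runs that did not produce a `best_model.mdl` are skipped, so the result is the newest run that actually saved a model.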

## Evaluation

- Evaluate EAE on the Event Argument Extraction task (given gold triggers)

```Bash
python eval_pipelineEE.py -ceae config/config_GenEAE_ace05e.json -eae [eae_model] -g
```

- Evaluate the ED+EAE pipeline on the Event Extraction task
```Bash
python eval_pipelineEE.py -ced config/config_GenED_ace05e.json -ceae config/config_GenEAE_ace05e.json -ed [ed_model] -eae [eae_model]
```

- Evaluate the E2E model on the Event Extraction task
```Bash
python eval_end2endEE.py -c config/config_GenE2E_ace05ep.json -e [e2e_model]
```
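
When evaluating several checkpoints, the two `eval_pipelineEE.py` invocations above can be assembled from Python instead of edited by hand. The following is a rough sketch: `pipeline_eval_command` is a hypothetical helper, and it uses only the flags shown above (`-ced`, `-ceae`, `-ed`, `-eae`, `-g`), adding `-g` when no ED model is supplied.

```python
import shlex


def pipeline_eval_command(eae_config, eae_model, ed_config=None, ed_model=None):
    """Build the argument list for eval_pipelineEE.py."""
    if (ed_config is None) != (ed_model is None):
        raise ValueError("ed_config and ed_model must be given together")
    cmd = ["python", "eval_pipelineEE.py"]
    if ed_config is not None:
        cmd += ["-ced", ed_config]
    cmd += ["-ceae", eae_config]
    if ed_model is not None:
        cmd += ["-ed", ed_model]
    cmd += ["-eae", eae_model]
    if ed_model is None:
        # No ED model: evaluate EAE alone with gold triggers.
        cmd.append("-g")
    return cmd


if __name__ == "__main__":
    # The model path below is a placeholder, not a real checkpoint.
    cmd = pipeline_eval_command("config/config_GenEAE_ace05e.json", "path/to/best_model.mdl")
    print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root.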
