This repository contains software, data, and models for:
- retraining BERT on the RAL-E dataset to obtain HateBERT
- the fine-tuned models for OffensEval 2019, AbusEval, and HatEval for the generic BERT model
- the fine-tuned models for OffensEval 2019, AbusEval, and HatEval for the HateBERT model
- the list of package and library requirements (requirements.txt) for setting up the environment and re-running the experiments

To further pre-train BERT on the RAL-E dataset, run the script "pre_train_command.sh".
The RAL-E dataset, including the training and test files, is stored in the /data/RAL-E/ folder.
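
The pre-training step can also be launched from Python. The sketch below is only an illustration, not part of the repository: it assumes that it is run from the repository root, that "pre_train_command.sh" sits there, and that the data folder is reachable as the relative path data/RAL-E. The helper names list_data_files and run_pretraining are hypothetical.

```python
import subprocess
from pathlib import Path


def list_data_files(data_dir):
    """Return the sorted names of the regular files directly inside data_dir."""
    folder = Path(data_dir)
    if not folder.is_dir():
        raise FileNotFoundError(f"RAL-E folder not found: {folder}")
    return sorted(p.name for p in folder.iterdir() if p.is_file())


def run_pretraining(script="pre_train_command.sh", data_dir="data/RAL-E"):
    """Check that the data folder is populated, then launch the shell script.

    Both default paths are assumptions about the repository layout.
    """
    files = list_data_files(data_dir)
    if not files:
        raise RuntimeError(f"No files found in {data_dir}")
    # check=True raises CalledProcessError if the script exits with a non-zero status.
    subprocess.run(["bash", script], check=True)
    return files


if __name__ == "__main__":
    print(run_pretraining())
```

Checking the folder first means a missing or empty data directory is reported before the pre-training run starts. If the script or the data live elsewhere, pass the paths explicitly, for example run_pretraining(script="path/to/pre_train_command.sh", data_dir="path/to/RAL-E").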

The scripts for fine-tuning BERT and HateBERT are stored in /software/fine_tuning_script/.
The scripts are organized by benchmark data (abuseval+hateval and offenseval) and by pre-trained language model (BERT vs. HateBERT).

The benchmark data can be obtained from the following links:
- OffensEval 2019: https://sites.google.com/site/offensevalsharedtask/olid
- AbusEval: https://github.com/tommasoc80/AbuseEval
- HatEval: https://github.com/msang/hateval 
