# Installation

The code is based on Fairseq (https://github.com/pytorch/fairseq). Please install PyTorch (>=1.2) with CUDA support, and then run "pip install --editable ." from the directory that contains "setup.py". Further instructions can be found on the official Fairseq page.
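Before training, it can help to confirm that the installed PyTorch meets the requirement above. The following is a minimal sketch, not part of this repository: the helper names `meets_minimum` and `check_environment` are hypothetical, and only `torch.__version__` and `torch.cuda.is_available()` are real PyTorch calls.

```python
import re

MIN_TORCH = (1, 2)  # minimum PyTorch version stated in this README


def meets_minimum(version, minimum=MIN_TORCH):
    """Return True if a version string such as '1.10.0+cu113' is >= minimum."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        return False
    return (int(match.group(1)), int(match.group(2))) >= minimum


def check_environment():
    """Describe the local PyTorch/CUDA setup without raising if torch is absent."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed"
    if not meets_minimum(torch.__version__):
        return "PyTorch %s is older than 1.2" % torch.__version__
    if not torch.cuda.is_available():
        return "PyTorch %s found, but CUDA is not available" % torch.__version__
    return "PyTorch %s with CUDA looks usable" % torch.__version__


if __name__ == "__main__":
    print(check_environment())
```

The version is compared as a tuple of integers rather than as a string, so that "1.10" is correctly treated as newer than "1.2".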

# Dataset and Pre-Processing

Grammarly's Yahoo Answers Formality Corpus (GYAFC) is available on request (https://github.com/raosudha89/GYAFC-corpus). Please download it and place it in the root directory of this repository.

Run "preprocess.sh" to preprocess the dataset. The preprocessing follows the same steps as Fairseq's BART model (https://github.com/pytorch/fairseq/blob/master/examples/bart/README.summarization.md). Please download the pretrained "bart.large" model from this link.

# Training

To train, run "pipeline.sh". All command-line parameters are defined in "fairseq/options.py". For the parameter values and their meaning, please refer to the paper and its appendix.

# Evaluation and Outputs

For generation, run "python evaluation/gen.py". Some folder paths may need to be changed depending on your configuration. To evaluate the outputs and compute BLEU scores, run "python evaluation/calc\_score.py path\_to\_output\_file".

The outputs of our model and several other models are given in "evaluation/outputs". As in Table 4 of the paper, we provide outputs for Hybrid Annotations, Pretrained w/ rules, Ours and Target. "\_family" refers to the F&R (Family & Relationships) domain and "\_music" refers to the E&M (Entertainment & Music) domain.
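When scoring several output files at once, the suffix convention above can be used to group files by domain. This is a minimal sketch that assumes the suffix ends the file name (before any extension). The file names in the usage note and the helpers `domain_of` and `group_by_domain` are hypothetical and not part of this repository.

```python
from pathlib import Path

# Suffix-to-domain mapping as described in this README.
DOMAIN_BY_SUFFIX = {
    "_family": "F&R",
    "_music": "E&M",
}


def domain_of(filename):
    """Return 'F&R' or 'E&M' for a file name, or None if no suffix matches."""
    path = Path(filename)
    for suffix, domain in DOMAIN_BY_SUFFIX.items():
        # Accept the suffix with or without a trailing file extension.
        if path.stem.endswith(suffix) or path.name.endswith(suffix):
            return domain
    return None


def group_by_domain(filenames):
    """Group file names by domain; unmatched names go under the key None."""
    groups = {}
    for name in filenames:
        groups.setdefault(domain_of(name), []).append(name)
    return groups
```

For example, `group_by_domain(["ours_family.txt", "ours_music.txt"])` puts the first name under "F&R" and the second under "E&M"; each group can then be passed file by file to "evaluation/calc\_score.py".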
