# CausalR: Causal Reasoning over Natural Language Rulebases


![Screenshot 2021-11-16 at 1 57 54 PM](https://user-images.githubusercontent.com/50492433/141948616-939fb5a8-dd20-4707-a997-a4ce49c14982.png)

The figure above shows an example of the reasoning task that we aim to solve with our modular reasoning model, CausalR.

![Screenshot 2021-11-16 at 3 28 04 PM](https://user-images.githubusercontent.com/50492433/141963705-fa1678d0-3583-4cce-b28d-de26ec4468a7.png)

The figure above gives an overview of the CausalR architecture, which comprises three independent modules: a rule selector, a fact selector, and a reasoner.

### Dependencies

- Install the dependencies listed in `requirements.txt`, for example with `pip install -r requirements.txt`.

### Pipeline for running CausalR

- Download the ProofWriter dataset from https://aristo-data-public.s3.amazonaws.com/proofwriter/proofwriter-dataset-V2020.12.3.zip. This gives you a zip file.
- From the directory of this repository, run the following commands to create the relevant data directories and copy the data into them:

```bash
# -p creates parent directories as needed and does not fail if they already exist
mkdir -p ../data/raw/
unzip <path to downloaded proofwriter dataset zip file> -d ../data/raw/
# Now we have all unzipped data in ../data/raw/proofwriter-dataset-V2020.12.3
mv ../data/raw/proofwriter-dataset-V2020.12.3 ../data/raw/proofwriter
# After the above step we have the directory "../data/raw/proofwriter" having all the data
mkdir ../saved
```
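Before moving on, it can help to confirm that the layout matches what the later commands expect. The following is a minimal sketch that assumes the directory names used above; `check_layout` is a hypothetical helper, not part of this repository.

```bash
# Hypothetical helper (not part of the repository): report whether the
# directories created above exist under a given parent directory.
check_layout() {
  local parent="${1:-..}" status=0 d
  for d in "$parent/data/raw/proofwriter" "$parent/saved"; do
    if [ -d "$d" ]; then
      echo "ok: $d"
    else
      echo "missing: $d"
      status=1
    fi
  done
  return "$status"
}

# Run from the repository directory; the data lives one level up.
check_layout .. || echo "layout incomplete: re-run the setup commands above"
```

The function returns a non-zero status when a directory is missing, so it can also gate later steps in a script.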

- After setting up the data as described above, train the models and then evaluate them with the following commands:

```bash
# make data for training ruleselector
python process_proofwriter.py --dataset pwq_leq_0to3 --pw_model pw_rule --arch roberta_large --world_assump OWA

# train ruleselector model
python main.py --override proofwriter_ruleselector,pwq_leq_0to3_OWA_rule

# make data for training factselector
python process_proofwriter.py --dataset pwq_leq_0to3 --pw_model pw_fact --arch roberta_large --world_assump OWA

# train factselector model
python main.py --override proofwriter_factselector,pwq_leq_0to3_OWA_fact

# make data for training reasoner
python process_proofwriter.py --dataset pw_leq_0to3 --pw_model pw_reasoner --arch t5_large --world_assump OWA

# train reasoner model (default is T5-large)
python main.py --override proofwriter_reasoner --dataset pw_leq_0to3_OWA_reasoner

# make the inference data (using unstaged files obtained from ProofWriter paper)
python process_proofwriter.py --dataset pwu_leq_5 --world_assump OWA
# Note: the above command generates the D5 data used for evaluation

# evaluate the model checkpoints from above using the inference pipeline
python main.py --override proofwriter_inference,evaluate --dataset pwu_leq_5_OWA --ruleselector_ckpt <path_to_trained_checkpoint> --factselector_ckpt <path_to_trained_checkpoint> --reasoner_ckpt <path_to_trained_checkpoint>
```
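The evaluation command needs three checkpoint paths, and a wrong path may only surface once the pipeline starts loading models. A small pre-flight check can catch that earlier. This is a sketch only: `require_ckpts` and the `*_CKPT` variable names are hypothetical, and the checkpoint locations remain placeholders for you to fill in.

```bash
# Hypothetical pre-flight check (not part of the repository): succeed only if
# every path passed as an argument exists.
require_ckpts() {
  local missing=0 ckpt
  for ckpt in "$@"; do
    if [ ! -e "$ckpt" ]; then
      echo "checkpoint not found: $ckpt" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage sketch: set the three variables to your own checkpoint paths first.
# require_ckpts "$RULESELECTOR_CKPT" "$FACTSELECTOR_CKPT" "$REASONER_CKPT" && \
#   python main.py --override proofwriter_inference,evaluate --dataset pwu_leq_5_OWA \
#     --ruleselector_ckpt "$RULESELECTOR_CKPT" \
#     --factselector_ckpt "$FACTSELECTOR_CKPT" \
#     --reasoner_ckpt "$REASONER_CKPT"
```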

The above commands train the models on the D0-3 data and evaluate them on the D5 data. The commands can be adapted to other settings in the same way. For example, to train the models on D3 data, replace `0to3` with `3` in the dataset names above (e.g. `pwq_leq_0to3` becomes `pwq_leq_3`).
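To make the substitution concrete, the sketch below prints (without running) the two selector commands for a chosen depth tag. `print_selector_cmds` is a hypothetical helper, not part of this repository. It assumes that the depth setting appears in the dataset name as the `0to3` part of `pwq_leq_0to3`, so that D3 corresponds to the tag `3`; check the resulting names against your data before relying on them.

```bash
# Hypothetical dry-run helper (not part of the repository): print the
# data-prep and training commands for both selector modules.
# $1 is the depth tag used in the dataset name (default: 0to3).
print_selector_cmds() {
  local tag="${1:-0to3}" assump="OWA" kind dataset
  dataset="pwq_leq_${tag}"
  for kind in rule fact; do
    echo "python process_proofwriter.py --dataset ${dataset} --pw_model pw_${kind} --arch roberta_large --world_assump ${assump}"
    echo "python main.py --override proofwriter_${kind}selector,${dataset}_${assump}_${kind}"
  done
}

# Print the commands for the assumed D3 naming.
print_selector_cmds 3
```

Because the helper only prints the commands, the dataset names can be reviewed before anything is launched.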
