# ImRL
> Source code for the paper "Implicit Relation Linking for Question Answering over Knowledge Graph".

## Code

All code files are in the `/code/` folder; see the individual files for details.

## Dataset

The datasets used in this paper are QALD, LC-QuAD, and PathSQ. They are available in the `/data&result/` folder.

## Result
The results are in the `/data&result/` folder. The predictions of ImRL are provided in three files, one per dataset.

### Dependencies
* Python 3
* PyTorch 1.x
* NumPy
* Gensim
* Requests
* Stanza
* SPARQLWrapper
* Transformers
* DGL-KE (used for training the knowledge graph embeddings)

> See `requirements.txt` for the required version of each package.
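
Before running anything, it can help to confirm that the packages above are importable. The following is a minimal sketch using only the standard library. The import names in `REQUIRED_MODULES` are assumptions based on how these packages are commonly imported (in particular `dglke` for DGL-KE); they are not taken from `requirements.txt`, and the check does not verify versions.

```python
from importlib.util import find_spec

# Assumed import names for the packages listed above (not read from
# requirements.txt); adjust if your installation differs.
REQUIRED_MODULES = {
    "PyTorch": "torch",
    "NumPy": "numpy",
    "Gensim": "gensim",
    "Requests": "requests",
    "Stanza": "stanza",
    "SPARQLWrapper": "SPARQLWrapper",
    "Transformers": "transformers",
    "DGL-KE": "dglke",
}


def missing_modules(modules=REQUIRED_MODULES):
    """Return the sorted display names of packages that cannot be found."""
    return sorted(
        name for name, module in modules.items() if find_spec(module) is None
    )


if __name__ == "__main__":
    missing = missing_modules()
    print("Missing packages:", ", ".join(missing) if missing else "none")
```

`find_spec` only locates a module without importing it, so the check stays fast even for heavy packages.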

### Running
1. Download or train all the required files beforehand and place them as shown in the directory tree below.
2. Use the data to train the path ranker model with `Trainer.py`, and store the trained model.
3. Run `Runner.py` to predict the results, and use `10-fold-train.py` in the script folder for evaluation.
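
This README does not describe what `10-fold-train.py` does internally; its name suggests 10-fold cross-validation. For readers unfamiliar with that technique, here is a generic sketch of how a dataset can be split into ten train/test folds. The function `k_fold_indices` is hypothetical and is not taken from the repository.

```python
import random


def k_fold_indices(n_items, k=10, seed=0):
    """Yield (train_idx, test_idx) index lists for k-fold cross-validation.

    Hypothetical helper for illustration; not the repository's script.
    """
    if n_items < k:
        raise ValueError("need at least k items to build k non-empty folds")
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # fixed seed keeps splits reproducible
    folds = [indices[i::k] for i in range(k)]  # k roughly equal-sized folds
    for i, test_idx in enumerate(folds):
        # Every fold except the i-th one forms the training set.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


if __name__ == "__main__":
    for fold, (train_idx, test_idx) in enumerate(k_fold_indices(25)):
        print(f"fold {fold}: {len(train_idx)} train / {len(test_idx)} test")
```

Each item appears in exactly one test fold, so averaging a metric over the ten folds uses every question once for evaluation.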

```
ROOT/
├─ DATA_PATH/
    ├── KGE/ckpts/RotatE/ # Knowledge graph embeddings trained by DGL-KE; must be trained beforehand.
    	├── config.json
    	├── entity.npy
    	├── relation.npy
    ├── ImRL/
    	├── data
    		├── dictionary
	    		├── Paraphrase.txt # Must be downloaded from GitHub.
             ├── glove.6B.200d.txt # Must be downloaded from the GloVe web page.
    	├── evaluation
    		├── QALD
    			├── train.json
    			├── test.json
    			├── ...
    		├── LC-QuAD
    		├── PathSQ
    ├── stanza_resources/ # Files for Stanza; must be downloaded beforehand.
├─ code/
	├─ model
		├─ IODataset.py
		├─ PathRanker.py
		├─ Runner.py
		├─ TestDataset.py
		├─ Trainer.py
	├─ script
		├─ train_kge.sh
	├─ NodeIdentifier.py
	├─ SubgraphConstructor.py
	├─ CandidateGenerator.py
	├─ requirements.txt 
```
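
The layout above can be checked before training with a short script. This is a minimal sketch that tests only a subset of the paths shown in the tree; the function name `missing_paths` and the default `DATA_PATH` argument are assumptions for illustration, not part of the repository.

```python
import sys
from pathlib import Path

# Relative paths copied from the directory tree above (a subset only).
EXPECTED = [
    "KGE/ckpts/RotatE/config.json",
    "KGE/ckpts/RotatE/entity.npy",
    "KGE/ckpts/RotatE/relation.npy",
    "ImRL/data/dictionary/Paraphrase.txt",
    "ImRL/evaluation/QALD/train.json",
    "ImRL/evaluation/QALD/test.json",
    "stanza_resources",
]


def missing_paths(data_path, expected=EXPECTED):
    """Return the expected relative paths that do not exist under data_path."""
    root = Path(data_path)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    # Usage: python check_layout.py /path/to/DATA_PATH
    data_path = sys.argv[1] if len(sys.argv) > 1 else "DATA_PATH"
    for rel in missing_paths(data_path):
        print("missing:", rel)
```

An empty result means every listed file or folder is in place; it does not validate the contents of the files.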