# PELT

All code, data, and models for this paper will be made publicly available.

## Requirements
```
pip install torch==1.6.0
pip install numpy
pip install ./transformers
```
After that, install [apex](https://github.com/NVIDIA/apex) for fp16 support.


## Generate entity embeddings from text
Code is in the `GenerateEmbed` folder.

Randomly select at most 256 sentences for each entity:
```
python3 generate_evalsamples.py
```
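
The sampling step can be pictured with the minimal sketch below. It is not the code in `generate_evalsamples.py`; the function name `sample_sentences`, the `(entity, sentence)` input format, and the `seed` argument are assumptions made for illustration. Only the cap of 256 sentences per entity comes from this README.

```python
import random
from collections import defaultdict

MAX_SENTENCES_PER_ENTITY = 256  # cap stated in this README


def sample_sentences(occurrences, max_per_entity=MAX_SENTENCES_PER_ENTITY, seed=0):
    """Group (entity, sentence) pairs and keep at most max_per_entity per entity.

    Hypothetical helper: the input format and the seed are assumptions.
    """
    rng = random.Random(seed)  # local generator keeps the sampling reproducible
    by_entity = defaultdict(list)
    for entity, sentence in occurrences:
        by_entity[entity].append(sentence)

    sampled = {}
    for entity, sentences in by_entity.items():
        if len(sentences) > max_per_entity:
            # Sample without replacement only when the entity exceeds the cap.
            sentences = rng.sample(sentences, max_per_entity)
        sampled[entity] = sentences
    return sampled
```

Entities with fewer sentences than the cap keep all of them, so rare entities are not dropped.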


Obtain the output representation of the masked token for each occurrence of an entity:
```
CUDA_VISIBLE_DEVICES=0 python3 run_generate.py --model_type roberta --model_name_or_path ../../bert_models/roberta-base     --data_dir ./heuristic_merge_roberta  --per_gpu_eval_batch_size 256  --fp16  --datasetname wiki
```

Aggregate the output representations to generate the embedding of each entity:
```
python3 generate_embed.py
```
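
As a rough illustration of the aggregation step, the sketch below averages the per-occurrence vectors of one entity. This assumes simple mean pooling; the aggregation in `generate_embed.py` may differ, for example in normalization or scaling. The function name `aggregate_embedding` is hypothetical, and plain Python lists are used to keep the example dependency-free.

```python
def aggregate_embedding(vectors):
    """Average a non-empty list of equal-length vectors into one embedding.

    Assumption: mean pooling. The repository's script may aggregate differently.
    """
    if not vectors:
        raise ValueError("need at least one occurrence vector")
    dim = len(vectors[0])
    total = [0.0] * dim
    for vec in vectors:
        if len(vec) != dim:
            raise ValueError("all occurrence vectors must have the same size")
        for i, value in enumerate(vec):
            total[i] += value
    # Divide the element-wise sum by the number of occurrences.
    return [value / len(vectors) for value in total]
```

For example, `aggregate_embedding([[1.0, 2.0], [3.0, 4.0]])` averages two occurrence vectors into a single two-dimensional embedding.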
## Relation Extraction


### Wiki80
Code is in the `RE` folder. Training and evaluation:

```
CUDA_VISIBLE_DEVICES=0 python3 run_wiki80_marker.py     --model_type robertamarker  --model_name_or_path ../bert_models/roberta-base-unk4     --do_train     --do_eval     --data_dir peng_data/wiki80   --max_seq_length 128     --per_gpu_eval_batch_size 64       --per_gpu_train_batch_size 32 --gradient_accumulation_steps 1   --learning_rate 3e-5     --save_steps 1000 --num_train_epochs 5   --evaluate_during_training --overwrite_output_dir   --entity_K 2 --fp16     --output_dir wiki80_models/test      --seed 42  
```


### FewRel

Code is in the `FewRel` folder. Data can be found in the [FewRel GitHub repo](https://github.com/thunlp/FewRel).

Training and evaluation:
```
CUDA_VISIBLE_DEVICES=0 python train_demo.py  --trainN  5 --N  5 --K 1 --Q 1 --model proto    --encoder luke --pretrain_ckpt ../bert_models/roberta-base-unk4  --hidden_size 768  --batch_size 32 --fp16   --grad_iter  1 --train_iter 1500 --val_step 250    --cat_entity_rep  --val val_pubmed --test val_pubmed   
```
The optional `--dot` flag, not used in the command above, makes Proto measure distance with the inner product instead of L2.
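
To make the difference between the two distance measures concrete, here is a minimal sketch of prototype-based classification. It is not the code in `train_demo.py`; the function names and the `use_dot` parameter are hypothetical, and the L2 variant is assumed to score by negative squared Euclidean distance.

```python
def prototype(support):
    """Mean of the support vectors of one class."""
    n = len(support)
    return [sum(column) / n for column in zip(*support)]


def score(query, proto, use_dot=False):
    """Similarity between a query and a prototype; higher means closer."""
    if use_dot:
        # Inner product similarity.
        return sum(q * p for q, p in zip(query, proto))
    # Negative squared L2 distance (assumed form of the default measure).
    return -sum((q - p) ** 2 for q, p in zip(query, proto))


def classify(query, support_sets, use_dot=False):
    """Return the index of the class whose prototype scores highest."""
    scores = [score(query, prototype(s), use_dot) for s in support_sets]
    return max(range(len(scores)), key=scores.__getitem__)


# Two classes with one support vector each (a 2-way 1-shot toy episode).
support_sets = [[[1.0, 0.0]], [[0.0, 3.0]]]
query = [1.0, 1.0]
print(classify(query, support_sets))                # L2 picks class 0
print(classify(query, support_sets, use_dot=True))  # inner product picks class 1
```

The toy episode shows that the two measures can disagree: the inner product favors prototypes with a large norm, while L2 favors prototypes that are close in position.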


## LAMA
Code is in the `LAMA` folder.

Data can be downloaded from the [LAMA GitHub repo](https://github.com/facebookresearch/LAMA).
```
CUDA_VISIBLE_DEVICES=0 python3 run_experiemnt.py 0
```
