# MatRank
We propose MatRank, which re-ranks the text retrieved for a given query by learning to predict the most relevant passage from a latent preference matrix.

Our ranker model is based on [Reranker](https://github.com/luyug/Reranker).

## Environment Requirements
See `requirements.txt`.

## Datasets
We provide demo data:
- Passage data is from [RocketQA](https://arxiv.org/pdf/2010.08191.pdf).
- Document data is from [HDCT](https://dl.acm.org/doi/pdf/10.1145/3366423.3380258).

The [Datasets for Document and Passage Ranking Leaderboards](https://microsoft.github.io/msmarco/Datasets) are also needed.


## Run Baseline
Before training models, replace the following paths in the script:
- `huggingface_model_local_dir`: local directory of the Hugging Face pretrained model you choose
- `ms_passage_corpus_dir`: passage corpus directory, which includes the query and passage files
- `ms_document_corpus_dir`: document corpus directory, which includes the query, URL, title, and body
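
A wrong path usually surfaces only after training has started, so it can help to check the three directories first. The snippet below is a minimal sketch and not part of this repository: the helper `missing_dirs` and the placeholder paths are hypothetical, and only the three variable names come from the list above.

```python
from pathlib import Path


def missing_dirs(paths):
    """Return the names whose configured path is not an existing directory.

    `paths` maps a variable name (as used in the training script) to the
    directory it is set to. This helper is hypothetical, for illustration only.
    """
    return [name for name, p in paths.items() if not Path(p).is_dir()]


if __name__ == "__main__":
    # Placeholder values: replace them with the paths you set in the script.
    configured = {
        "huggingface_model_local_dir": "path/to/huggingface_model",
        "ms_passage_corpus_dir": "path/to/ms_passage_corpus",
        "ms_document_corpus_dir": "path/to/ms_document_corpus",
    }
    for name in missing_dirs(configured):
        print(f"{name} does not point to an existing directory")
```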

## Check Dev Score
```shell
python3 tools/cal_rank_score.py outputs/demo_passage/ dev_truth_path
```
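
For reference, the MS MARCO leaderboards report mean reciprocal rank (MRR@10 for passages, MRR@100 for documents). The sketch below shows how such a score can be computed. It is an illustration under assumptions, not the code in `tools/cal_rank_score.py`: the function name, the in-memory input format, and the example ids are hypothetical.

```python
def mean_reciprocal_rank(ranked, relevant, cutoff=10):
    """Compute MRR@cutoff.

    ranked:   dict mapping query id -> list of candidate ids, best first
    relevant: dict mapping query id -> set of relevant candidate ids
    Queries with no relevant candidate in the top `cutoff` contribute 0.
    """
    if not relevant:
        return 0.0
    total = 0.0
    for qid, gold in relevant.items():
        candidates = ranked.get(qid, [])[:cutoff]
        for rank, cid in enumerate(candidates, start=1):
            if cid in gold:
                # Only the first relevant hit counts for reciprocal rank.
                total += 1.0 / rank
                break
    return total / len(relevant)


if __name__ == "__main__":
    # Toy example with made-up ids.
    ranked = {"q1": ["p3", "p1", "p7"], "q2": ["p9", "p4"]}
    relevant = {"q1": {"p1"}, "q2": {"p2"}}
    print(mean_reciprocal_rank(ranked, relevant))  # q1 hits at rank 2, q2 misses
```

The score is averaged over every query in the ground truth, so a query missing from the ranking output lowers the score instead of being skipped.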
