# Implementation for EMNLP Submission: HyQE: Ranking Contexts with Hypothetical Query Embeddings

This repository contains the code for the long paper submission `HyQE: Ranking Contexts with Hypothetical Query Embeddings`.

## Installation
* Run `sh install.sh` (**check the contents of `install.sh` beforehand**).
* Create a conda environment.
* Install `poetry` with `pip install poetry`.
* Install the dependencies with `poetry install`.
* Install `Faiss` with `conda install -c pytorch -c nvidia faiss-gpu=1.8.0` (see the [Faiss installation guide](https://github.com/facebookresearch/faiss/blob/main/INSTALL.md) for details).
* Download the shared file from [this link](https://drive.google.com/file/d/11enMG6c7nEbwUHcyyYHNJGzos2yoKcwt/view?usp=sharing). Unzip it, move the result to [`hyqe/hyqe/src`](hyqe/hyqe/src), and rename it to `.cache`.
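
Once the steps above are done, a quick check can confirm that nothing was skipped. The following is a minimal sketch, not part of the repository: `check_setup` is a hypothetical helper, and it assumes that it is given the directory containing this README (so that the cache is expected at `hyqe/hyqe/src/.cache`).

```shell
#!/bin/sh
# Sketch: report which installation prerequisites are still missing.
# Assumption: the first argument is the repository root (defaults to ".").
check_setup() {
    root="${1:-.}"
    missing=0
    # Both tools are used by the installation steps.
    for tool in conda poetry; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing tool: $tool"
            missing=$((missing + 1))
        fi
    done
    # The downloaded archive should end up here after unzipping and renaming.
    if [ ! -e "$root/hyqe/hyqe/src/.cache" ]; then
        echo "missing cache: $root/hyqe/hyqe/src/.cache"
        missing=$((missing + 1))
    fi
    if [ "$missing" -eq 0 ]; then
        echo "setup looks complete"
    fi
    # The exit status is the number of missing items (0 means success).
    return "$missing"
}
```

Calling `check_setup .` from the repository root prints one line per missing item. It does not verify the Faiss installation or the Python dependencies.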

### To add new dependencies
* From the repository root, run `poetry add the_dependency` to add a dependency from a package index, or `poetry add the_path_to_the_dependency --editable` to install a local dependency in development mode.
* Run `poetry lock && poetry install`.
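
The two steps above can be wrapped in a small helper. This is only an illustrative sketch: `add_dependency` is a hypothetical function, and it assumes that an argument naming an existing directory is a local package, while anything else is a package name.

```shell
#!/bin/sh
# Sketch: add a dependency, then refresh the lock file and the environment.
# Assumption: a directory argument means "local package, editable install".
add_dependency() {
    if [ -d "$1" ]; then
        # Local dependency in development (editable) mode.
        poetry add --editable "$1"
    else
        # Dependency fetched from a package index.
        poetry add "$1"
    fi && poetry lock && poetry install
}
```

For example, `add_dependency requests` adds a published package, and `add_dependency ../my_local_pkg` (a placeholder path) adds a local one.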

## Structure
* The [`hyqe/hyqe/src`](hyqe/hyqe/src) directory contains the source code for using hypothetical queries to improve RAG systems. The implementation is based on [HyDE](https://github.com/texttron/hyde).
* The [`hyqe/test`](hyqe/test) directory contains test files that can be used as examples for running experiments.


## Run experiments
* Open [`hyqe/hyqe/run.sh`](hyqe/hyqe/run.sh).
* Edit the variables to set the arguments, as explained in the comments in the script.
* Run `./run.sh` from [`hyqe/hyqe`](hyqe/hyqe).
* Collect the results from [`hyqe/hyqe/log.txt`](hyqe/hyqe/log.txt).
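
The steps above can be sketched as a single shell function. `run_experiment` is a hypothetical helper, not part of the repository. It assumes that `run.sh` is executable and writes its results to `log.txt` in the same directory, and that the first argument is the repository root.

```shell
#!/bin/sh
# Sketch: run the experiment script and print the end of the log.
# Assumptions: run.sh has already been edited, is executable, and writes log.txt.
run_experiment() {
    workdir="${1:-.}/hyqe/hyqe"
    (
        # The subshell keeps the caller's working directory unchanged.
        cd "$workdir" || exit 1
        ./run.sh || exit 1
        # Show the last lines of the results.
        tail -n 20 log.txt
    )
}
```

Calling `run_experiment .` from the repository root runs the script and prints up to the last 20 lines of the log.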
 
## Collected Datasets
* The prebuilt indexes provided by `pyserini` are listed [here](https://github.com/castorini/pyserini/blob/master/docs/prebuilt-indexes.md).

 
