ED2LM: Encoder-Decoder to Language Model for Faster Document Re-ranking Inference

Kai Hui; Honglei Zhuang; Tao Chen; Zhen Qin; Jing Lu; Dara Bahri; Ji Ma; Jai Gupta; Cicero dos Santos; Yi Tay; Donald Metzler

doi:10.18653/v1/2022.findings-acl.295

Abstract

State-of-the-art neural models typically encode document-query pairs using cross-attention for re-ranking. To this end, models generally utilize an encoder-only (like BERT) paradigm or an encoder-decoder (like T5) approach. These paradigms, however, are not without flaws: running the model on all query-document pairs at inference time incurs a significant computational cost. This paper proposes a new training and inference paradigm for re-ranking. We propose to finetune a pretrained encoder-decoder model in the form of document-to-query generation. Subsequently, we show that this encoder-decoder architecture can be decomposed into a decoder-only language model during inference. This results in significant inference-time speedups since the decoder-only architecture only needs to learn to interpret static encoder embeddings during inference. Our experiments show that this new paradigm achieves results that are comparable to the more expensive cross-attention ranking approaches while being up to 6.8X faster. We believe this work paves the way for more efficient neural rankers that leverage large pretrained models.
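The offline/online split described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: a smoothed unigram model stands in for both the encoder and the decoder, and the names `encode_document`, `query_log_likelihood`, and `rerank` are hypothetical. The only point it shows is that document representations are computed once, ahead of time, and that query-time scoring reads those cached representations to estimate how likely the query is given each document.

```python
import math
from collections import Counter


def encode_document(text):
    """Offline step (toy stand-in for the encoder): build a static representation."""
    tokens = text.lower().split()
    return {"counts": Counter(tokens), "length": len(tokens)}


def query_log_likelihood(doc_repr, query, vocab_size):
    """Online step (toy stand-in for the decoder): log P(query | document).

    Uses add-one smoothing so unseen query tokens get a non-zero probability.
    """
    total = doc_repr["length"] + vocab_size
    return sum(
        math.log((doc_repr["counts"][tok] + 1) / total)
        for tok in query.lower().split()
    )


def rerank(index, query, vocab_size):
    """Score every cached document against the query, highest likelihood first."""
    scored = [
        (doc_id, query_log_likelihood(rep, query, vocab_size))
        for doc_id, rep in index.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical three-document collection, invented for illustration only.
docs = {
    "d1": "neural rankers encode query document pairs with cross attention",
    "d2": "the decoder scores a query given cached document embeddings",
    "d3": "dublin hosted the conference in may",
}

# Offline: encode each document once and keep the result.
index = {doc_id: encode_document(text) for doc_id, text in docs.items()}
vocab_size = len({tok for text in docs.values() for tok in text.lower().split()})

# Online: only the cheap scoring step runs per query.
ranking = rerank(index, "cached document embeddings", vocab_size)
print([doc_id for doc_id, _ in ranking])  # ['d2', 'd1', 'd3']
```

In the paper's setting the cached representation would be the encoder output of a fine-tuned encoder-decoder model and the scorer would be its decoder; the sketch keeps only the control flow, not the model.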

Anthology ID: 2022.findings-acl.295
Volume: Findings of the Association for Computational Linguistics: ACL 2022
Month: May
Year: 2022
Address: Dublin, Ireland
Editors: Smaranda Muresan, Preslav Nakov, Aline Villavicencio
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 3747–3758
URL: https://aclanthology.org/2022.findings-acl.295
DOI: 10.18653/v1/2022.findings-acl.295
Cite (ACL): Kai Hui, Honglei Zhuang, Tao Chen, Zhen Qin, Jing Lu, Dara Bahri, Ji Ma, Jai Gupta, Cicero Nogueira dos Santos, Yi Tay, and Donald Metzler. 2022. ED2LM: Encoder-Decoder to Language Model for Faster Document Re-ranking Inference. In Findings of the Association for Computational Linguistics: ACL 2022, pages 3747–3758, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal): ED2LM: Encoder-Decoder to Language Model for Faster Document Re-ranking Inference (Hui et al., Findings 2022)
PDF: https://preview.aclanthology.org/improve-issue-templates/2022.findings-acl.295.pdf
Data: MS MARCO, Natural Questions
