GeAR: Generation Augmented Retrieval
Haoyu Liu, Shaohan Huang, Jianfeng Liu, Yuefeng Zhan, Hao Sun, Weiwei Deng, Feng Sun, Furu Wei, Qi Zhang
Abstract
Document retrieval techniques are essential for developing large-scale information systems. The common approach involves using a bi-encoder to compute the semantic similarity between a query and documents. However, the scalar similarity often fail to reflect enough information, hindering the interpretation of retrieval results. In addition, this process primarily focuses on global semantics, overlooking the finer-grained semantic relationships between the query and the document’s content. In this paper, we introduce a novel method, Generation Augmented Retrieval (GeAR), which not only improves the global document-query similarity through contrastive learning, but also integrates well-designed fusion and decoding modules. This enables GeAR to generate relevant context within the documents based on a given query, facilitating learning to retrieve local fine-grained information.Furthermore, when used as a retriever, GeAR does not incur any additional computational cost over bi-encoders. GeAR exhibits competitive retrieval performance across diverse scenarios and tasks. Moreover, qualitative analysis and the results generated by GeAR provide novel insights into the interpretation of retrieval results. The code, data, and models will be released at https://github.com/microsoft/LMOps.- Anthology ID:
- 2025.findings-acl.166
- Volume:
- Findings of the Association for Computational Linguistics: ACL 2025
- Month:
- July
- Year:
- 2025
- Address:
- Vienna, Austria
- Editors:
- Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
- Venue:
- Findings
- SIG:
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 3193–3207
- Language:
- URL:
- https://preview.aclanthology.org/landing_page/2025.findings-acl.166/
- DOI:
- Cite (ACL):
- Haoyu Liu, Shaohan Huang, Jianfeng Liu, Yuefeng Zhan, Hao Sun, Weiwei Deng, Feng Sun, Furu Wei, and Qi Zhang. 2025. GeAR: Generation Augmented Retrieval. In Findings of the Association for Computational Linguistics: ACL 2025, pages 3193–3207, Vienna, Austria. Association for Computational Linguistics.
- Cite (Informal):
- GeAR: Generation Augmented Retrieval (Liu et al., Findings 2025)
- PDF:
- https://preview.aclanthology.org/landing_page/2025.findings-acl.166.pdf