Shida Wei
2025
QAEncoder: Towards Aligned Representation Learning in Question Answering Systems
Zhengren Wang | Qinhan Yu | Shida Wei | Zhiyu Li | Feiyu Xiong | Xiaoxing Wang | Simin Niu | Hao Liang | Wentao Zhang
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Modern QA systems entail retrieval-augmented generation (RAG) for accurate and trustworthy responses. However, the inherent gap between user queries and relevant documents hinders precise matching. We introduce QAEncoder, a training-free approach to bridge this gap. Specifically, QAEncoder estimates the expectation of potential queries in the embedding space as a robust surrogate for the document embedding, and attaches document fingerprints to effectively distinguish these embeddings. Extensive experiments across diverse datasets, languages, and embedding models confirm QAEncoder’s alignment capability, which offers a simple yet effective solution with no additional index storage, retrieval latency, or training cost, and without catastrophic forgetting or hallucination issues. The repository is publicly available at https://github.com/IAAR-Shanghai/QAEncoder.
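The idea in the abstract, replacing a document embedding with an estimate of the expected embedding of its potential queries, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `alpha` weight, and the use of a simple interpolation with the document's own embedding as a stand-in for the "document fingerprint" are all assumptions made for illustration. Query generation and the embedding model are assumed to exist elsewhere; the sketch only takes precomputed vectors.

```python
import numpy as np


def l2_normalize(v: np.ndarray) -> np.ndarray:
    """Scale vectors to unit length along the last axis."""
    return v / np.linalg.norm(v, axis=-1, keepdims=True)


def surrogate_embedding(
    doc_emb: np.ndarray,
    query_embs: np.ndarray,
    alpha: float = 0.5,  # hypothetical fingerprint weight, not from the paper
) -> np.ndarray:
    """Blend the mean generated-query embedding with the document embedding.

    doc_emb:    shape (d,), embedding of the document itself.
    query_embs: shape (n, d), embeddings of n queries generated for the document.
    """
    # Monte Carlo estimate of the expected query embedding.
    mean_query = l2_normalize(query_embs).mean(axis=0)
    # Assumed fingerprint step: mix in the document's own embedding so that
    # documents with similar queries remain distinguishable.
    mixed = (1.0 - alpha) * mean_query + alpha * l2_normalize(doc_emb)
    return l2_normalize(mixed)


# Toy usage with 2-dimensional vectors.
doc = np.array([1.0, 0.0])
queries = np.array([[0.0, 1.0], [0.0, 2.0]])
vec = surrogate_embedding(doc, queries, alpha=0.5)
```

Because the surrogate has the same dimensionality as the original document embedding, it can be stored in place of that embedding, which is consistent with the abstract's claim of no extra index storage or retrieval latency.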