Junhao Zeng


2025

SQUARE: Unsupervised Retrieval Adaptation via Synthetic Data
Jinsung Yoon | Junhao Zeng | Sercan O Arik
Findings of the Association for Computational Linguistics: EMNLP 2025

Pre-trained retrieval models often face challenges in zero-shot retrieval for knowledge-based question answering, as different tasks rely on different corpora. We introduce SQUARE (Synthetic QUery-based Adaptive REtrieval), a novel method for corpus-specific unsupervised retrieval customization. SQUARE leverages LLMs to generate grounded synthetic question-answer pairs from the corpus, which are then used to fine-tune the retriever. A filtering mechanism based on the synthetic answers is employed to ensure high quality of tuning data. Extensive experiments on various datasets demonstrate superior performance of SQUARE compared to zero-shot retrieval and other customization methods, highlighting the value of corpus adaptation for effective retrieval.
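
The pipeline described in the abstract (generate synthetic question-answer pairs from the corpus, filter them using the synthetic answers, then use the survivors as retriever tuning data) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not say how the filter works, so the rule used here (keep a pair only if its answer string appears in the source passage) is an assumption. All names (`SyntheticPair`, `generate_pairs`, `is_grounded`, `build_training_set`, `toy_generator`) are hypothetical, and the toy generator stands in for an LLM call.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple

# A generator takes a passage and returns (question, answer) tuples.
# In SQUARE this role is played by an LLM; here it is a hypothetical stand-in.
Generator = Callable[[str], Iterable[Tuple[str, str]]]


@dataclass(frozen=True)
class SyntheticPair:
    passage: str
    question: str
    answer: str


def generate_pairs(corpus: Iterable[str], generator: Generator) -> List[SyntheticPair]:
    """Produce synthetic question-answer pairs for every passage in the corpus."""
    pairs = []
    for passage in corpus:
        for question, answer in generator(passage):
            pairs.append(SyntheticPair(passage, question, answer))
    return pairs


def is_grounded(pair: SyntheticPair) -> bool:
    """Assumed filter: keep a pair only if its answer occurs in the passage."""
    answer = pair.answer.strip().lower()
    return bool(answer) and answer in pair.passage.lower()


def build_training_set(corpus: Iterable[str], generator: Generator) -> List[Tuple[str, str]]:
    """Return (question, passage) pairs that pass the filter.

    These pairs would serve as positives when fine-tuning a retriever.
    """
    return [
        (pair.question, pair.passage)
        for pair in generate_pairs(corpus, generator)
        if is_grounded(pair)
    ]


def toy_generator(passage: str) -> List[Tuple[str, str]]:
    """Hypothetical stand-in for an LLM: one grounded pair, one ungrounded pair."""
    first_word = passage.split()[0]
    return [
        ("Which word opens the passage?", first_word),
        ("What is missing from the passage?", "zzz-not-present"),
    ]


corpus = [
    "Retrievers map queries to passages.",
    "Synthetic data can adapt a retriever.",
]
training_set = build_training_set(corpus, toy_generator)
```

In this sketch the ungrounded pair from each passage is discarded, leaving one (question, passage) pair per passage. A real setup would replace `toy_generator` with an LLM prompt and pass the filtered pairs to a retriever fine-tuning routine.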