Zhenlin Su
2025
Dense Retrievers Can Fail on Simple Queries: Revealing The Granularity Dilemma of Embeddings
Liyan Xu | Zhenlin Su | Mo Yu | Jiangnan Li | Fandong Meng | Jie Zhou
Findings of the Association for Computational Linguistics: EMNLP 2025
This work stems from an observed limitation of text encoders: embeddings may not be able to recognize fine-grained entities or events within encoded semantics, resulting in failed retrieval even in simple cases. To examine such behaviors, we first introduce a new evaluation dataset, CapRetrieval, in which passages are image captions and queries are phrases targeting entity or event concepts in diverse forms. Zero-shot evaluation suggests that encoders often struggle with this fine-grained matching, regardless of training sources or model size. Aiming for enhancement, we proceed to finetune encoders with our proposed data generation strategies, enabling a small 0.1B encoder to outperform the state-of-the-art 7B model. Within this process, we further uncover the granularity dilemma, a challenge for embeddings to capture fine-grained salience while aligning with overall semantics. Our dataset, code, and models in this work are publicly released at https://github.com/lxucs/CapRetrieval.
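The retrieval setup the abstract describes can be pictured with a minimal sketch: embed captions and a short entity-level query with an off-the-shelf encoder and rank by cosine similarity. This is illustrative only and not the released CapRetrieval evaluation code; the model name, captions, and query below are hypothetical placeholders, assuming the sentence-transformers library.

```python
# Minimal sketch of phrase-to-caption dense retrieval (assumptions noted above).
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # any off-the-shelf encoder

captions = [
    "A man in a red jacket rides a bicycle past a fruit stand.",
    "Two dogs play with a frisbee on a sandy beach.",
]
query = "fruit stand"  # short phrase targeting a fine-grained entity concept

# Encode both sides and score every caption against the query.
cap_emb = model.encode(captions, convert_to_tensor=True, normalize_embeddings=True)
q_emb = model.encode(query, convert_to_tensor=True, normalize_embeddings=True)
scores = util.cos_sim(q_emb, cap_emb)[0]

best = int(scores.argmax())
print(captions[best], float(scores[best]))
```

The failure mode studied in the paper is that such similarity scores may not surface the caption containing the queried entity, even in simple cases like the one sketched here.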
2024
Identifying Factual Inconsistencies in Summaries: Grounding LLM Inference via Task Taxonomy
Liyan Xu | Zhenlin Su | Mo Yu | Jin Xu | Jinho D. Choi | Jie Zhou | Fei Liu
Findings of the Association for Computational Linguistics: EMNLP 2024
Factual inconsistencies pose a significant hurdle for faithful summarization by generative models. While a major direction to enhance inconsistency detection is to derive stronger Natural Language Inference (NLI) models, we propose an orthogonal aspect that underscores the importance of incorporating task-specific taxonomy into the inference. To this end, we consolidate key error types of inconsistent facts in summaries, and incorporate them to facilitate both the zero-shot and supervised paradigms of LLMs. Extensive experiments on ten datasets of five distinct domains suggest that zero-shot LLM inference benefits from the explicit solution space depicted by the error type taxonomy, and achieves state-of-the-art performance overall, surpassing specialized non-LLM baselines as well as recent LLM baselines. We further distill models that fuse the taxonomy into parameters through our designed prompt completions and supervised training strategies, efficiently substituting state-of-the-art zero-shot inference with much larger LLMs.
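The zero-shot paradigm the abstract refers to amounts to exposing the error-type taxonomy in the prompt so the LLM reasons over an explicit solution space. The sketch below is a hedged illustration of that idea: the taxonomy labels and prompt wording are placeholders rather than the authors' exact formulation, and it assumes the openai Python client (any chat-completion LLM could be substituted).

```python
# Hedged sketch of taxonomy-grounded zero-shot inconsistency detection
# (illustrative labels and prompt; not the paper's exact setup).
from openai import OpenAI

ERROR_TYPES = [
    "entity error: a wrong person, object, or name",
    "predicate error: a wrong action or relation",
    "circumstance error: wrong time, place, or manner",
    "out-of-context error: content not supported by the source",
]

def check_consistency(source: str, summary: str) -> str:
    # Build a prompt that enumerates the error-type taxonomy explicitly.
    prompt = (
        "Given the source document and a summary, decide whether the summary "
        "is factually consistent with the source. Consider these error types:\n- "
        + "\n- ".join(ERROR_TYPES)
        + f"\n\nSource:\n{source}\n\nSummary:\n{summary}\n\n"
        "Answer 'consistent' or name the single most applicable error type."
    )
    client = OpenAI()
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content
```

In the supervised variant described above, the same taxonomy is fused into smaller models via prompt completions and finetuning rather than stated in the prompt at inference time.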
Co-authors
- Liyan Xu 2
- Mo Yu 2
- Jie Zhou (周洁) 2
- Jinho D. Choi 1
- Jiangnan Li 1