Lan Zhang

Papers on this page may belong to the following people: Lan Zhang

2026

How Do LLMs "Trust" Unknown Knowledge? An Unknown Knowledge Based Jailbreak Attack
Yixiao Huang | Lan Zhang | Chaoran Wang
Findings of the Association for Computational Linguistics: ACL 2026

Learning unknown knowledge through in-context learning (ICL) and retrieval-augmented generation (RAG) can enhance LLM capabilities in specialized fields. While most research focuses on how to identify and utilize such knowledge, little work examines what factors lead LLMs to trust and adopt it, leaving models prone to errors and harmful content. Grounded in extensive pre-experiments, we design five pairs of trust-enhancing and trust-diminishing transformations on unknown knowledge to experimentally identify the key trust factors. These findings are further substantiated through a detailed theoretical analysis grounded in the epistemological framework of evidentialism. Based on these insights, we propose a completely unrestricted and fully randomized jailbreak attack that embeds malicious queries within trust-enhanced unknown knowledge. In both defended and undefended scenarios, our method achieves a 99% to 100% attack success rate (ASR) on all tested LLMs, including the latest GPT-5.1, setting a new state of the art. This attack confirms the trust mechanism and exposes a critical and hard-to-defend security risk. Our conclusions provide valuable guidance for understanding the trust mechanism of unknown knowledge and for future research.


VET: Verifiable Execution Tracing for Reliable Text-to-SQL Generation
Dongyu Wang | Jingyu Li | Lan Zhang | Ganggang.yu | Liang Huang
Findings of the Association for Computational Linguistics: ACL 2026

Large language models (LLMs) have shown remarkable capabilities in text-to-SQL generation, yet existing approaches remain prone to hallucinations and lack verification mechanisms. Current methods such as Chain-of-Thought (CoT) and Program-of-Thought (PoT) typically rely on intermediate reasoning that is either purely textual or executed only as a final step, leaving the reasoning process opaque and prone to grounding and logical hallucinations. In this paper, we introduce Verifiable Execution Tracing (VET), a novel reasoning paradigm that transforms text-to-SQL from unverifiable textual rationales into step-wise executable semantics. VET addresses these limitations by constraining the reasoning process within a candidate schema space and formulating it as a sequence of executable Python steps. Crucially, each step is executed against the real database to produce observable intermediate results, which serve as immediate verification feedback and transform the traditionally opaque generation process into a transparent, debuggable interaction with database reality. Experiments show consistent gains under matched, training-free settings, achieving 70.93% execution accuracy on BIRD and 37.04% on Spider 2.0-lite, with particularly strong improvements on complex queries.
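
The idea of executing every reasoning step against the database and inspecting its result can be illustrated with a minimal sketch. It is not the VET implementation: the `orders` table, the `run_traced_steps` helper, and the three steps are hypothetical, and each step here is a small SQL probe run from Python against an in-memory SQLite database, which simplifies the "executable Python steps" the abstract describes.

```python
import sqlite3


def run_traced_steps(conn, steps):
    """Run (description, sql) steps in order and record each observable result.

    Hypothetical helper for illustration: stops at the first failing step so
    the error can serve as feedback before any later step is attempted.
    """
    trace = []
    for description, sql in steps:
        try:
            rows = conn.execute(sql).fetchall()
            trace.append({"step": description, "ok": True, "rows": rows})
        except sqlite3.Error as exc:
            trace.append({"step": description, "ok": False, "error": str(exc)})
            break
    return trace


# Toy database standing in for the "real database" (assumed schema).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL);
    INSERT INTO orders (customer, amount)
    VALUES ('ann', 10.0), ('bob', 25.0), ('ann', 5.0);
    """
)

# Each intermediate step is executed, not just the final query.
steps = [
    ("check the candidate table has rows", "SELECT COUNT(*) FROM orders"),
    ("inspect distinct customers",
     "SELECT DISTINCT customer FROM orders ORDER BY customer"),
    ("final aggregate",
     "SELECT customer, SUM(amount) FROM orders "
     "GROUP BY customer ORDER BY customer"),
]

trace = run_traced_steps(conn, steps)
for entry in trace:
    print(entry)
```

In this sketch a step that references a nonexistent column fails immediately and ends the trace, so the error is visible at the step that caused it rather than only after a final query is assembled.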

2025


RemoteRAG: A Privacy-Preserving LLM Cloud RAG Service
Yihang Cheng | Lan Zhang | Junyang Wang | Mu Yuan | Yunhao Yao
Findings of the Association for Computational Linguistics: ACL 2025

Retrieval-augmented generation (RAG) improves the service quality of large language models by retrieving relevant documents from credible literature and integrating them into the context of the user query. Recently, the rise of the cloud RAG service has made it possible for users to query relevant documents conveniently. However, directly sending queries to the cloud brings potential privacy leakage. In this paper, we are the first to formally define the privacy-preserving cloud RAG service to protect the user query and propose RemoteRAG as a solution regarding privacy, efficiency, and accuracy. For privacy, we introduce (n,𝜖)-DistanceDP to characterize privacy leakage of the user query and the leakage inferred from relevant documents. For efficiency, we limit the search range from the total documents to a small number of selected documents related to a perturbed embedding generated from (n,𝜖)-DistanceDP, so that computation and communication costs required for privacy protection significantly decrease. For accuracy, we ensure that the small range includes target documents related to the user query with detailed theoretical analysis. Experimental results also demonstrate that RemoteRAG can resist existing embedding inversion attack methods while achieving no loss in retrieval under various settings. Moreover, RemoteRAG is efficient, incurring only 0.67 seconds and 46.66 KB of data transmission (2.72 hours and 1.43 GB with the non-optimized privacy-preserving scheme) when retrieving from a total of 10⁵ documents.
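
The efficiency idea of narrowing the search to a small candidate set around a perturbed embedding can be sketched loosely as follows. This is not the RemoteRAG protocol: the Gaussian noise is not calibrated to (n,𝜖)-DistanceDP, the second stage is done in the clear instead of under a privacy-preserving scheme, and the function names, dimensions, noise scale, and candidate-set size are all assumptions chosen for illustration.

```python
import math
import random


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def perturb(query, scale, rng):
    """Add independent Gaussian noise to each coordinate (assumed mechanism)."""
    return [x + rng.gauss(0.0, scale) for x in query]


def top_k(point, docs, k):
    """Indices of the k documents closest to `point`, nearest first."""
    return sorted(range(len(docs)), key=lambda i: l2(point, docs[i]))[:k]


rng = random.Random(0)
docs = [[rng.uniform(-1, 1) for _ in range(8)] for _ in range(200)]
query = [rng.uniform(-1, 1) for _ in range(8)]

# Stage 1: only the perturbed embedding is used to select a wider candidate set.
noisy = perturb(query, scale=0.05, rng=rng)
candidates = top_k(noisy, docs, k=40)

# Stage 2: re-rank the small candidate set with the true query embedding.
candidate_docs = [docs[i] for i in candidates]
recovered = [candidates[i] for i in top_k(query, candidate_docs, k=5)]

# Reference: exact top-5 over the full collection, for comparison only.
exact = top_k(query, docs, k=5)
print("recovered:", recovered)
print("exact:    ", exact)
```

The sketch offers no guarantee that the candidate set contains the true top documents; the abstract states that the paper establishes this inclusion through theoretical analysis, which the fixed candidate-set size here does not reproduce.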

Co-authors

Mu Yuan 1

Venues

Findings 3
