Yuyang Ye
2026
Logic Jailbreak: Efficiently Unlocking LLM Safety Restrictions Through Formal Logical Expression
Jingyu Peng | Maolin Wang | Nan Wang | Jiatong Li | Yuchen Li | Yuyang Ye | Wanyu Wang | Pengyue Jia | Kai Zhang | Xiangyu Zhao
Findings of the Association for Computational Linguistics: ACL 2026
Despite substantial advancements in aligning LLMs with human values, current safety mechanisms remain susceptible to jailbreak attacks. We attribute this vulnerability to the distributional discrepancies between alignment-oriented prompts and malicious prompts. To investigate this, and drawing inspiration from logic-driven NLP tasks, we introduce LogiBreak, a universal black-box jailbreak method that utilizes logical expression translation to bypass LLM safety mechanisms. By converting harmful natural language prompts into formal logical expressions, LogiBreak exploits the distributional gap between alignment data and logic-expressed inputs, preserving the underlying semantic intent and readability while evading safety constraints. Furthermore, to fill the gap of existing benchmarks that lack systematic resources specifically targeting logical expression-based attacks against LLM robustness, we construct a novel multilingual logical expression jailbreak dataset for evaluation. Our evaluations of LogiBreak in five languages demonstrate its effectiveness and generalizability in various linguistic contexts. The code is available at https://github.com/Applied-Machine-Learning-Lab/ACL2026_Logibreak.
2025
UniRAG: Unified Query Understanding Method for Retrieval Augmented Generation
Rui Li | Liyang He | Qi Liu | Zheng Zhang | Heng Yu | Yuyang Ye | Linbo Zhu | Yu Su
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Retrieval-Augmented Generation (RAG) technology effectively addresses the issues of knowledge update lag and hallucinations in large language models (LLMs) by integrating internal and external knowledge. Existing query augmentation methods improve RAG’s performance in handling complex queries but face two key challenges: (1) the separation of query augmentation and encoding tasks, which hinders information sharing and introduces cumulative errors, and (2) the difficulty of selecting the optimal augmentation strategy for different scenarios. In this work, we propose UniRAG, a unified framework for query understanding in RAG. UniRAG employs a decoder-only LLM to jointly perform query augmentation and encoding, eliminating task separation. To facilitate adaptive query augmentation, we categorize existing techniques into query paraphrasing, query expansion, and query abstraction. Our model learns to select the optimal augmentation strategy based on user queries, leveraging retrieval and generation outputs as feedback. Experimental results show that UniRAG significantly outperforms traditional query augmentation methods on five knowledge-intensive benchmark tasks across both closed-domain and open-domain question answering.