Rui Pu
2025
DSG-MCTS: A Dynamic Strategy-Guided Monte Carlo Tree Search for Diversified Reasoning in Large Language Models
Rui Ha | Chaozhuo Li | Rui Pu | Litian Zhang | Xi Zhang | Sen Su
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Large language models (LLMs) have shown strong potential in complex reasoning tasks. However, as task complexity increases, their performance often degrades, resulting in hallucinations, errors, and logical inconsistencies. To enhance reasoning capabilities, Monte Carlo Tree Search (MCTS) has been introduced to guide the exploration of reasoning paths in a structured manner. Despite its advantages, traditional MCTS relies on fixed reasoning strategies, limiting the diversity of reasoning paths and the coverage of the solution space. To address these limitations, we propose Dynamic Strategy-Guided MCTS (DSG-MCTS), a novel framework that dynamically integrates multiple reasoning strategies, such as abductive and analogical reasoning, to expand the reasoning space. At the same time, DSG-MCTS enhances reasoning efficiency through a dynamic strategy selection mechanism that adapts to the task context. Experimental results on challenging reasoning benchmarks demonstrate that DSG-MCTS achieves improved accuracy and efficiency, outperforming existing state-of-the-art methods.
2024
BaitAttack: Alleviating Intention Shift in Jailbreak Attacks via Adaptive Bait Crafting
Rui Pu | Chaozhuo Li | Rui Ha | Litian Zhang | Lirong Qiu | Xi Zhang
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Jailbreak attacks enable malicious queries to evade detection by LLMs. Existing attacks focus on meticulously constructing prompts to disguise harmful intentions. However, the incorporation of sophisticated disguising prompts may incur the challenge of “intention shift”. Intention shift occurs when the additional semantics within the prompt distract the LLMs, causing the responses to deviate significantly from the original harmful intentions. In this paper, we propose a novel component, “bait”, to alleviate the effects of intention shift. Bait comprises an initial response to the harmful query, prompting LLMs to rectify or supplement the knowledge within the bait. By furnishing rich semantics relevant to the query, the bait helps LLMs focus on the original intention. To conceal the harmful content within the bait, we further propose a novel attack paradigm, BaitAttack. BaitAttack adaptively generates necessary components to persuade targeted LLMs that they are engaging with a legitimate inquiry in a safe context. Our proposal is evaluated on a popular dataset, demonstrating state-of-the-art attack performance and an exceptional capability for mitigating intention shift. The implementation of BaitAttack is accessible at: https://anonymous.4open.science/r/BaitAttack-D1F5.
Co-authors
- Rui Ha 2
- Chaozhuo Li 2
- Litian Zhang 2
- Xi Zhang 2
- Lirong Qiu 1
- Sen Su 1