Tianle Zhang


2026

While LLM watermarking is essential for machine-generated content identification, existing paraphrase-based attacks struggle to balance watermark removal efficacy with text quality. We propose TSAPA, a training-free evolutionary framework that models watermark removal as a constrained multi-objective optimization problem. By leveraging genetic algorithms to navigate the Pareto front, TSAPA utilizes a Pseudo-Log-Likelihood (PLL)-guided mutation to precisely target and modify watermark-carrying tokens. Experiments on the Qwen3 series (1.7B/8B/32B) across multiple watermark schemes show that TSAPA achieves over 90% attack success rate (ASR) while maintaining high semantic fidelity, significantly outperforming baseline methods. This work exposes critical vulnerabilities in current watermarks and provides a new perspective for robust evaluation.
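
To make the Pareto-front idea in the TSAPA abstract concrete, the sketch below filters a small population of candidate paraphrases down to its non-dominated set. It is an illustration only, not the authors' implementation: the two objectives (a watermark detection score to minimize and a semantic similarity to maximize), the candidate names, and all numeric values are hypothetical assumptions chosen for the example.

```python
from typing import List, Tuple

# Hypothetical candidate: (name, detect_score, similarity).
# Assumed objectives: lower detect_score is better, higher similarity is better.
Candidate = Tuple[str, float, float]


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a[1] <= b[1] and a[2] >= b[2]
    strictly_better = a[1] < b[1] or a[2] > b[2]
    return no_worse and strictly_better


def pareto_front(population: List[Candidate]) -> List[Candidate]:
    """Keep the candidates that no other candidate dominates."""
    return [
        c for c in population
        if not any(dominates(o, c) for o in population if o is not c)
    ]


# Made-up scores for illustration only.
population = [
    ("p1", 0.90, 0.95),  # barely edited: watermark still detectable
    ("p2", 0.20, 0.80),
    ("p3", 0.25, 0.70),  # worse than p2 on both objectives
    ("p4", 0.10, 0.60),
]
front = pareto_front(population)
print([c[0] for c in front])
```

In an evolutionary loop, a filter of this kind would typically be applied after each round of mutation to decide which candidates survive; the abstract does not specify how TSAPA performs that selection, so this should be read as one common choice rather than the paper's procedure.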
Multimodal Large Language Models (MLLMs) frequently hallucinate due to their reliance on fragile, linear reasoning and weak visual grounding. We propose Visual Attention Reasoning (VAR), a reinforcement learning framework that reformulates reasoning as a hierarchical search with self-verification. VAR enforces traceable evidence grounding by generating explicit bounding boxes, guided by a novel reward function combining geometric precision and semantic sufficiency. Furthermore, it replaces linear Chain-of-Thought with a tree-search policy capable of backtracking to correct logical errors. Theoretical analysis validates the framework’s reliability, and extensive experiments demonstrate that VAR significantly outperforms state-of-the-art methods on complex hallucination and safety benchmarks.
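
The VAR abstract describes a reward that combines geometric precision with semantic sufficiency but does not give its form. As a rough sketch under stated assumptions, the example below uses intersection-over-union (IoU) between a predicted and a reference bounding box as a stand-in for geometric precision, and a weighted sum to combine it with a semantic score supplied by the caller. The function names, the `alpha` weight, and the linear combination are all hypothetical and should not be read as the paper's reward.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), with x1 < x2 and y1 < y2


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def grounding_reward(pred: Box, ref: Box, semantic_score: float,
                     alpha: float = 0.5) -> float:
    """Hypothetical reward: weighted sum of IoU and a semantic score in [0, 1]."""
    return alpha * iou(pred, ref) + (1.0 - alpha) * semantic_score


# Two 2x2 boxes overlapping in a 1x1 square: IoU = 1 / (4 + 4 - 1) = 1/7.
overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))
reward = grounding_reward((0, 0, 2, 2), (0, 0, 2, 2), semantic_score=0.5)
```

A weighted sum is only one way to combine the two terms; a product or a thresholded gate would penalize boxes that are well placed but semantically insufficient more strongly.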