Xiangyu Wu
2026
MirrorCAPTCHA: Wild CAPTCHA, Wild Distribution, Wild Web-based Platform Meet Multimodal LLM Agents
Xiangyu Wu | Yuwei Hu | Tianyu Cui | Yueying Tian | Qing-Guo Chen | Zhao Xu | Weihua Luo | Kaifu Zhang | Yang Yang | Jianfeng Lu
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
The path to fully autonomous web agents is currently hindered by a critical bottleneck: their limited ability to handle CAPTCHA. Existing agent benchmarks largely ignore this practical challenge, failing to evaluate an agent’s real-world capacity to solve CAPTCHA. To bridge this gap, we conduct a comprehensive analysis of real-world CAPTCHA distributions and introduce MirrorCAPTCHA, a benchmark annotated with Weighted Pass Rate and a newly proposed metric, Completion Degree. MirrorCAPTCHA is designed to serve as a “mirror” that faithfully reflects the automation capabilities of agents in real scenarios. We filter 2095 websites from Common Crawl, identify the CAPTCHAs deployed on these sites, and cluster them into 18 distinct categories using the K-means algorithm. To ensure practicality, we extract a web subgraph from Common Crawl covering these websites and use random walks to simulate real-world CAPTCHA encounter frequencies, yielding a realistic measure of agents’ ability. Additionally, we develop a lightweight synthetic data pipeline to train Ovis2-Agent-CAPTCHA-8B, which significantly outperforms current state-of-the-art closed-source models on MirrorCAPTCHA, achieving a 9.4% higher average Weighted Pass Rate and a 2.13% higher average Completion Degree than the runner-up, Gemini-2.5-Pro.
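The random-walk weighting described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the restart probability, the toy graph, and the frequency-weighted form of the pass rate are all assumptions made for illustration, since the abstract does not define them.

```python
import random
from collections import Counter


def encounter_frequencies(graph, captcha_of, steps, seed=0, restart=0.15):
    """Estimate how often each CAPTCHA category is met on a random walk.

    graph:      dict mapping a site to the list of sites it links to
    captcha_of: dict mapping a site to its CAPTCHA category (hypothetical labels)
    restart:    assumed probability of jumping to a random site at each step
    """
    rng = random.Random(seed)
    nodes = sorted(graph)
    current = rng.choice(nodes)
    hits = Counter()
    for _ in range(steps):
        category = captcha_of.get(current)
        if category is not None:
            hits[category] += 1
        neighbours = graph.get(current, [])
        # Jump to a random site on dead ends, or with probability `restart`.
        if not neighbours or rng.random() < restart:
            current = rng.choice(nodes)
        else:
            current = rng.choice(neighbours)
    total = sum(hits.values())
    return {c: n / total for c, n in hits.items()} if total else {}


def weighted_pass_rate(frequencies, pass_rates):
    """Frequency-weighted mean of per-category pass rates (assumed form)."""
    return sum(w * pass_rates.get(c, 0.0) for c, w in frequencies.items())


# Toy data: three sites, two made-up CAPTCHA categories.
toy_graph = {"a.example": ["b.example"], "b.example": ["a.example", "c.example"], "c.example": []}
toy_captcha = {"a.example": "slider", "b.example": "text", "c.example": "slider"}
freqs = encounter_frequencies(toy_graph, toy_captcha, steps=10_000)
score = weighted_pass_rate(freqs, {"slider": 0.9, "text": 0.5})
```

Under this assumed weighting, a category that the walk meets often contributes more to the score than a rare one, which is the intuition behind weighting pass rates by encounter frequency.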
OneRec-Think: In-Text Reasoning for Generative Recommendation
Zhanyu Liu | Shiyao Wang | Xingmei Wang | Rongzhou Zhang | Jiaxin Deng | Honghui Bao | Jinghao Zhang | Wuchao Li | PengFei Zheng | Xiangyu Wu | Yifei Hu | Qigen Hu | Xinchen Luo | Lejian Ren | Zhang Zixing | Qianqian Wang | Kuo Cai | Yunfan Wu | Hongtao Cheng | Zexuan Cheng | Lu Ren | Huanjie Wang | Yi Su | Ruiming Tang | Kun Gai | Guorui Zhou
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
The powerful generative capacity of Large Language Models (LLMs) has instigated a paradigm shift in recommendation. However, existing generative models (e.g., OneRec) operate as implicit predictors, critically lacking the capacity for explicit and controllable reasoning, a key advantage of LLMs. To bridge this gap, we propose OneRec-Think, a unified framework that seamlessly integrates dialogue, reasoning, and personalized recommendation. OneRec-Think incorporates: (1) Itemic Alignment: cross-modal Item-Textual Alignment for semantic grounding; (2) Reasoning Activation: Reasoning Scaffolding to activate LLM reasoning within the recommendation context; and (3) Reasoning Enhancement: a recommendation-specific reward function that accounts for the multi-validity nature of user preferences. Experiments across public benchmarks show state-of-the-art performance. Moreover, our proposed "Think-Ahead" architecture enables effective industrial deployment, achieving a 0.159% gain in APP Stay Time and validating the practical efficacy of the model’s explicit reasoning capability.
Co-authors
- Honghui Bao 1
- Kuo Cai 1
- Qing-Guo Chen 1
- Hongtao Cheng 1
- Zexuan Cheng 1
- Tianyu Cui 1
- Jiaxin Deng 1
- Kun Gai 1
- Qigen Hu 1
- Yifei Hu 1
- Yuwei Hu 1
- Wuchao Li 1
- Zhanyu Liu 1
- Jianfeng Lu 1
- Weihua Luo 1
- Xinchen Luo 1
- Lejian Ren 1
- Lu Ren 1
- Yi Su 1
- Ruiming Tang 1
- Yueying Tian 1
- Huanjie Wang 1
- Qianqian Wang 1
- Shiyao Wang 1
- Xingmei Wang 1
- Yunfan Wu 1
- Zhao Xu 1
- Yang Yang 1
- Jinghao Zhang 1
- Kaifu Zhang 1
- Rongzhou Zhang 1
- PengFei Zheng 1
- Guorui Zhou 1
- Zhang Zixing 1
Venues
- ACL 2