Yi Duan

2026

Multi-step retrosynthetic planning is a fundamental challenge in organic chemistry, traditionally modeled as a combinatorial search problem guided by single-step prediction models. However, this search-centric paradigm often disconnects from the explicit chemical reasoning processes employed by human experts. In this paper, we propose R³ (Reinforced Reasoning Retrosynthesis), a novel framework that reformulates this task as end-to-end generative reasoning. Instead of traversing a search tree, R³ simulates the problem-solving logic of chemists to directly generate complete synthetic pathways. To achieve this, we initialize the model with domain knowledge and employ end-to-end Reinforcement Learning (RL) to optimize the entire planning policy. Experimental results on Retrobench show that R³ achieves a state-of-the-art Top-1 accuracy of 43.7%, demonstrating that generative reasoning offers a superior alternative to traditional search algorithms in solving complex retrosynthetic problems.

Co-authors

Yifei Wang

Lihao Wang

Lijun Wu

Hao Zhou

Venues

ACL
