RL-Guider: Leveraging Historical Decisions and Feedback for Drug Editing with Large Language Models

Xufeng Liu, Yixuan Ding, Jingxiang Qu, Yichi Zhang, Wenhan Gao, Yi Liu


Abstract
The recent success of large language models (LLMs) across diverse domains showcases their potential to revolutionize scientific fields, including drug editing. Traditional drug editing relies on iterative conversations with domain experts, refining the drug until the desired property is achieved. This interactive and iterative process mirrors the strengths of LLMs, making them well suited for drug editing. In existing works, however, LLMs edit each molecule independently, without leveraging knowledge from past edits. Human experts, by contrast, develop intuition about effective modifications through historical experience; accumulating past knowledge is pivotal for human experts, and so it is for LLMs. In this work, we propose RL-Guider, a reinforcement-learning agent that provides editing suggestions to the LLM and improves itself over time using the feedback obtained by evaluating the edits the LLM makes from its recommendations. RL-Guider is the first approach to leverage both the comprehensive “world-level” knowledge of LLMs and the knowledge accumulated from historical feedback. As a result, RL-Guider mitigates several shortcomings of existing approaches and demonstrates superior performance. The code is available at https://github.com/xufliu/RL-Guider.
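The abstract describes a recommend-edit-evaluate-update loop: an RL agent proposes an editing suggestion, the LLM applies it to a molecule, the edited molecule is scored, and the score is fed back to the agent as a reward. The sketch below illustrates that loop in minimal Python; it is not the authors' implementation, and every name (Guider, llm_edit, evaluate_property, the suggestion list) is a hypothetical stand-in, with the LLM call and the property oracle left as stubs.

```python
import random

# Hypothetical fixed set of editing suggestions the agent can recommend.
SUGGESTIONS = ["add a hydroxyl group", "replace a halogen", "add an amide bond"]


class Guider:
    """Toy epsilon-greedy bandit over a fixed suggestion set."""

    def __init__(self, suggestions, epsilon=0.1):
        self.values = {s: 0.0 for s in suggestions}   # running mean reward per suggestion
        self.counts = {s: 0 for s in suggestions}
        self.epsilon = epsilon

    def recommend(self):
        # Explore with probability epsilon, otherwise pick the best-known suggestion.
        if random.random() < self.epsilon:
            return random.choice(list(self.values))
        return max(self.values, key=self.values.get)

    def update(self, suggestion, reward):
        # Incremental mean update from the observed reward.
        self.counts[suggestion] += 1
        n = self.counts[suggestion]
        self.values[suggestion] += (reward - self.values[suggestion]) / n


def llm_edit(smiles, suggestion):
    """Stub for a prompted LLM call that returns an edited SMILES string."""
    return smiles


def evaluate_property(smiles):
    """Stub property oracle, e.g. a predicted solubility score."""
    return random.random()


guider = Guider(SUGGESTIONS)
molecule = "CCO"
baseline = evaluate_property(molecule)
for _ in range(20):
    suggestion = guider.recommend()
    edited = llm_edit(molecule, suggestion)
    reward = evaluate_property(edited) - baseline  # improvement over the original molecule
    guider.update(suggestion, reward)
```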
Anthology ID: 2025.findings-acl.680
Volume: Findings of the Association for Computational Linguistics: ACL 2025
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 13121–13138
URL: https://preview.aclanthology.org/landing_page/2025.findings-acl.680/
Cite (ACL): Xufeng Liu, Yixuan Ding, Jingxiang Qu, Yichi Zhang, Wenhan Gao, and Yi Liu. 2025. RL-Guider: Leveraging Historical Decisions and Feedback for Drug Editing with Large Language Models. In Findings of the Association for Computational Linguistics: ACL 2025, pages 13121–13138, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): RL-Guider: Leveraging Historical Decisions and Feedback for Drug Editing with Large Language Models (Liu et al., Findings 2025)
PDF: https://preview.aclanthology.org/landing_page/2025.findings-acl.680.pdf