Large Language Models for Controllable Multi-property Multi-objective Molecule Optimization

Vishal Dey, Xiao Hu, Xia Ning


Abstract
In real-world drug design, molecule optimization requires selectively improving multiple molecular properties up to pharmaceutically relevant levels, while maintaining others that already meet such criteria. However, existing computational approaches and instruction-tuned LLMs fail to capture such nuanced property-specific objectives, limiting their practical applicability. To address this, we introduce C-MuMOInstruct, the first instruction-tuning dataset focused on multi-property optimization with explicit, property-specific objectives. Leveraging C-MuMOInstruct, we develop GeLLM⁴O-Cs, a series of instruction-tuned LLMs that can perform targeted property-specific optimization. Our experiments across 5 in-distribution and 5 out-of-distribution tasks show that GeLLM⁴O-Cs consistently outperform strong baselines, achieving up to a 126% higher success rate. Notably, GeLLM⁴O-Cs exhibit impressive zero-shot generalization to novel optimization tasks and unseen instructions. This offers a step toward a foundational LLM to support realistic, diverse optimizations with property-specific objectives. C-MuMOInstruct and code are accessible through https://github.com/ninglab/GeLLMO-C.
Anthology ID:
2025.findings-emnlp.1145
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2025
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
20996–21023
URL:
https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.1145/
DOI:
10.18653/v1/2025.findings-emnlp.1145
Cite (ACL):
Vishal Dey, Xiao Hu, and Xia Ning. 2025. Large Language Models for Controllable Multi-property Multi-objective Molecule Optimization. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 20996–21023, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
Large Language Models for Controllable Multi-property Multi-objective Molecule Optimization (Dey et al., Findings 2025)
PDF:
https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.1145.pdf
Checklist:
2025.findings-emnlp.1145.checklist.pdf