polyBART: A Chemical Linguist for Polymer Property Prediction and Generative Design

Anagha Savit, Harikrishna Sahu, Shivank S. Shukla, Wei Xiong, Rampi Ramprasad


Abstract
Designing polymers for targeted applications and accurately predicting their properties are key challenges in materials science, owing to the vast and complex polymer chemical space. While molecular language models have proven effective in solving analogous problems for molecular discovery, similar advancements for polymers are limited. To address this gap, we propose polyBART, a language-model-driven polymer discovery capability that enables rapid and accurate exploration of the polymer design space. Central to our approach is Pseudo-polymer SELFIES (PSELFIES), a novel representation that allows for the transfer of molecular language models to the polymer space. polyBART is, to the best of our knowledge, the first language model capable of bidirectional translation between polymer structures and properties, achieving state-of-the-art results in property prediction and design of novel polymers for electrostatic energy storage. Further, polyBART is validated through a combination of computational and laboratory experiments. We report what we believe is the first successful synthesis and validation of a polymer designed by a language model: the polymer was predicted to exhibit a high thermal degradation temperature, and our laboratory measurements confirmed this prediction. Our work presents a generalizable strategy for adapting molecular language models to the polymer space and introduces a polymer foundation model, advancing generative polymer design that may be adapted for a variety of applications.
Anthology ID:
2025.findings-emnlp.647
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2025
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
12104–12119
URL:
https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.647/
DOI:
10.18653/v1/2025.findings-emnlp.647
Cite (ACL):
Anagha Savit, Harikrishna Sahu, Shivank S. Shukla, Wei Xiong, and Rampi Ramprasad. 2025. polyBART: A Chemical Linguist for Polymer Property Prediction and Generative Design. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 12104–12119, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
polyBART: A Chemical Linguist for Polymer Property Prediction and Generative Design (Savit et al., Findings 2025)
PDF:
https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.647.pdf
Checklist:
 2025.findings-emnlp.647.checklist.pdf