@inproceedings{gupta-etal-2025-llms-bayesian,
  title     = {{LLMs} for {Bayesian} Optimization in Scientific Domains: Are We There Yet?},
  author    = {Gupta, Rushil and
               Hartford, Jason and
               Liu, Bang},
  editor    = {Christodoulopoulos, Christos and
               Chakraborty, Tanmoy and
               Rose, Carolyn and
               Peng, Violet},
  booktitle = {Findings of the Association for Computational Linguistics: EMNLP 2025},
  month     = nov,
  year      = {2025},
  address   = {Suzhou, China},
  publisher = {Association for Computational Linguistics},
  url       = {https://aclanthology.org/2025.findings-emnlp.838/},
  doi       = {10.18653/v1/2025.findings-emnlp.838},
  pages     = {15482--15510},
  isbn      = {979-8-89176-335-7},
  abstract  = {Large language models (LLMs) have recently been proposed as general-purpose agents for experimental design, with claims that they can perform in-context experimental design. We evaluate this hypothesis using open-source instruction-tuned LLMs applied to genetic perturbation and molecular property discovery tasks. We find that LLM-based agents show no sensitivity to experimental feedback: replacing true outcomes with randomly permuted labels has no impact on performance. Across benchmarks, classical methods such as linear bandits and Gaussian process optimization consistently outperform LLM agents. We further propose a simple hybrid method, LLM-guided Nearest Neighbour (LLMNN) sampling, that combines LLM prior knowledge with nearest-neighbor sampling to guide the design of experiments. LLMNN achieves competitive or superior performance across domains without requiring significant in-context adaptation. These results suggest that current open-source LLMs do not perform in-context experimental design in practice and highlight the need for hybrid frameworks that decouple prior-based reasoning from batch acquisition with updated posteriors.},
}
Markdown (Informal)
[LLMs for Bayesian Optimization in Scientific Domains: Are We There Yet?](https://aclanthology.org/2025.findings-emnlp.838/) (Gupta et al., Findings 2025)
ACL