Elvis Hsieh
2025
LLaMP: Large Language Model Made Powerful for High-fidelity Materials Knowledge Retrieval
Yuan Chiang | Elvis Hsieh | Chia-Hong Chou | Janosh Riebesell
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Materials science research requires multi-step reasoning and precise materials informatics retrieval, where minor errors can propagate into significant failures in downstream experiments. Despite their general success, Large Language Models (LLMs) often struggle with hallucinations, with effectively handling domain-specific data (e.g., crystal structures), and with integrating into experimental workflows. To address these challenges, we introduce LLaMP, a hierarchical multi-agent framework designed to emulate the materials science research workflow. A high-level supervisor agent decomposes user requests into sub-tasks and coordinates with specialized assistant agents. These assistant agents handle domain-specific tasks, such as retrieving and processing data from the Materials Project (MP) or conducting simulations as needed. This pipeline facilitates iterative refinement of material property retrieval and enables the simulation of real-world research workflows. To ensure reliability, we propose a novel metric combining uncertainty and confidence estimates to evaluate the self-consistency of responses from LLaMP and baseline methods. Our experiments demonstrate LLaMP’s superior performance in material property retrieval, crystal structure editing, and annealing molecular dynamics simulations using pre-trained interatomic potentials. Unlike prior work focused solely on material property prediction or discovery, LLaMP serves as a foundation for autonomous materials research by grounding responses in materials informatics and enabling iterative experimental processes. Code and live demo are available at https://github.com/chiang-yuan/llamp.
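To make the hierarchical design concrete, here is a minimal Python sketch of the supervisor/assistant dispatch pattern the abstract describes. All class, function, and agent names, and the fixed two-step plan, are illustrative assumptions rather than LLaMP's actual API; see the repository linked above for the real implementation.

```python
# Illustrative sketch of a supervisor agent decomposing a request into
# sub-tasks and dispatching each to a specialized assistant agent.
# Names and the hard-coded plan are assumptions, not LLaMP's API.
from dataclasses import dataclass
from typing import Callable


@dataclass
class SubTask:
    agent: str   # which assistant should handle this step
    query: str   # the decomposed piece of the user request


def mp_retrieval_agent(query: str) -> str:
    # Placeholder: a real assistant would query the Materials Project API.
    return f"[MP data for: {query}]"


def simulation_agent(query: str) -> str:
    # Placeholder: a real assistant would run a simulation, e.g. annealing
    # molecular dynamics with a pre-trained interatomic potential.
    return f"[simulation result for: {query}]"


ASSISTANTS: dict[str, Callable[[str], str]] = {
    "mp": mp_retrieval_agent,
    "simulation": simulation_agent,
}


def supervisor(request: str) -> list[str]:
    """Decompose a user request and route each sub-task to an assistant."""
    # In LLaMP the decomposition is done by an LLM; a fixed plan stands in here.
    plan = [
        SubTask("mp", f"band gap of {request}"),
        SubTask("simulation", f"anneal {request} and report the final structure"),
    ]
    return [ASSISTANTS[task.agent](task.query) for task in plan]


if __name__ == "__main__":
    for result in supervisor("GaN"):
        print(result)
```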
2024
Reasoning and Tools for Human-Level Forecasting
Elvis Hsieh | Preston Fu | Jonathan Chen
Proceedings of the Workshop on the Future of Event Detection (FuturED)