HoneyComb: A Flexible LLM-Based Agent System for Materials Science

Huan Zhang, Yu Song, Ziyu Hou, Santiago Miret, Bang Liu


Abstract
The emergence of specialized large language models (LLMs) has shown promise in addressing complex tasks in materials science. Many LLMs, however, struggle with the distinct complexities of materials science tasks, such as computational challenges, and rely heavily on outdated implicit knowledge, leading to inaccuracies and hallucinations. To address these challenges, we introduce HoneyComb, the first LLM-based agent system specifically designed for materials science. HoneyComb draws on a reliable, high-quality materials science knowledge base (MatSciKB) and a tool hub (ToolHub) tailored to materials science to enhance its reasoning and computational capabilities. MatSciKB is a curated, structured knowledge collection based on reliable literature, while ToolHub employs an Inductive Tool Construction method to generate, decompose, and refine API tools for materials science. In addition, HoneyComb uses a retriever module that adaptively selects the appropriate knowledge source or tools for each task, which helps keep its answers accurate and relevant. Our results demonstrate that HoneyComb significantly outperforms baseline models across various tasks in materials science, narrowing the gap between current LLM capabilities and the specialized needs of this domain. Furthermore, our adaptable framework can be extended to other scientific domains, highlighting its potential for broad applicability in advancing scientific research and applications.
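The retriever described in the abstract chooses between a knowledge source and a tool for each task. The abstract does not say how that selection is made, so the following is only a minimal sketch of the general idea, assuming a simple token-overlap score in place of whatever retriever the authors actually use. The names `Source`, `score`, `select`, and the two example entries are hypothetical and are not part of HoneyComb, MatSciKB, or ToolHub.

```python
from dataclasses import dataclass
from typing import Callable, List, Set


@dataclass
class Source:
    """Hypothetical retrievable item: a knowledge entry or a callable tool."""
    name: str
    kind: str                      # "knowledge" or "tool"
    description: str               # text the retriever matches against
    run: Callable[[str], str]      # produces an answer for a query


def tokenize(text: str) -> Set[str]:
    """Lowercase the words and strip basic punctuation."""
    words = (w.strip(".,?!").lower() for w in text.split())
    return {w for w in words if w}


def score(query: str, source: Source) -> float:
    """Jaccard overlap between query and description tokens.

    This is an assumed stand-in for a real retriever, not the paper's method.
    """
    q, d = tokenize(query), tokenize(source.description)
    union = q | d
    return len(q & d) / len(union) if union else 0.0


def select(query: str, sources: List[Source]) -> Source:
    """Pick the source whose description best matches the query."""
    return max(sources, key=lambda s: score(query, s))


# Two toy entries, invented for illustration only.
SOURCES = [
    Source(
        name="band_gap_note",
        kind="knowledge",
        description="definition of band gap in semiconductors",
        run=lambda q: "A band gap is the energy range with no electron states.",
    ),
    Source(
        name="density_calculator",
        kind="tool",
        description="compute density from mass and volume",
        run=lambda q: "density = mass / volume",
    ),
]

if __name__ == "__main__":
    for query in (
        "What is the band gap of semiconductors?",
        "Compute the density from mass 10 and volume 2",
    ):
        chosen = select(query, SOURCES)
        print(f"{query!r} -> {chosen.kind}: {chosen.name}")
```

In this sketch a conceptual question is routed to the knowledge entry and a computational request is routed to the tool, which mirrors the adaptive selection the abstract describes at a very small scale.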
Anthology ID:
2024.findings-emnlp.192
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2024
Month:
November
Year:
2024
Address:
Miami, Florida, USA
Editors:
Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
3369–3382
URL:
https://preview.aclanthology.org/fix-sig-urls/2024.findings-emnlp.192/
DOI:
10.18653/v1/2024.findings-emnlp.192
Cite (ACL):
Huan Zhang, Yu Song, Ziyu Hou, Santiago Miret, and Bang Liu. 2024. HoneyComb: A Flexible LLM-Based Agent System for Materials Science. In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 3369–3382, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal):
HoneyComb: A Flexible LLM-Based Agent System for Materials Science (Zhang et al., Findings 2024)
PDF:
https://preview.aclanthology.org/fix-sig-urls/2024.findings-emnlp.192.pdf