Boyu Gou
2025
Tooling or Not Tooling? The Impact of Tools on Language Agents for Chemistry Problem Solving
Botao Yu | Frazier N. Baker | Ziru Chen | Garrett Herb | Boyu Gou | Daniel Adu-Ampratwum | Xia Ning | Huan Sun
Findings of the Association for Computational Linguistics: NAACL 2025
To enhance large language models (LLMs) for chemistry problem solving, several LLM-based agents augmented with tools have been proposed, such as ChemCrow and Coscientist. However, their evaluations are narrow in scope, leaving a large gap in understanding the benefits of tools across diverse chemistry tasks. To bridge this gap, we develop ChemAgent, an enhanced chemistry agent over ChemCrow, and conduct a comprehensive evaluation of its performance on both specialized chemistry tasks and general chemistry questions. Surprisingly, ChemAgent does not consistently outperform its base LLMs without tools. Our error analysis with a chemistry expert suggests that for specialized chemistry tasks, such as synthesis prediction, we should augment agents with specialized tools; however, for general chemistry questions like those in exams, agents’ ability to reason correctly with chemistry knowledge matters more, and tool augmentation does not always help.
2024
WebOlympus: An Open Platform for Web Agents on Live Websites
Boyuan Zheng | Boyu Gou | Scott Salisbury | Zheng Du | Huan Sun | Yu Su
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: System Demonstrations
Web agents are emerging as powerful tools capable of performing complex tasks across diverse web environments. The rapid development of large multimodal models is further enhancing this advancement. However, there is a lack of standardized and user-friendly tools for research and development, as well as experimental platforms on live websites. To address this challenge, we present WebOlympus, an open platform for web agents operating on live websites. WebOlympus offers a Chrome extension-based UI, enabling users without programming experience to easily utilize the platform. It allows users to run web agents with various designs using only a few lines of code or simple clicks on the Chrome extension. To ensure the trustworthiness of web agents, a safety monitor module that prevents harmful actions through human supervision or model-based control is incorporated. WebOlympus supports diverse applications, including annotation interfaces for web agent trajectories and data crawling.
Co-authors
- Huan Sun 2
- Daniel Adu-Ampratwum 1
- Frazier N. Baker 1
- Ziru Chen 1
- Zheng Du 1