PRACT: Optimizing Principled Reasoning and Acting of LLM Agent

Zhiwei Liu; Weiran Yao; Jianguo Zhang; Zuxin Liu; Liangwei Yang; Rithesh R N; Tian Lan; Ming Zhu; Juntao Tan; Shirley Kokane; Thai Quoc Hoang; Juan Carlos Niebles; Shelby Heinecke; Huan Wang; Silvio Savarese; Caiming Xiong

doi:10.18653/v1/2024.conll-1.33

Abstract

We introduce the Principled Reasoning and Acting (PRAct) framework, a novel method for learning and enforcing action principles from trajectory data. Central to our approach is the use of text gradients from a reflection and optimization engine to derive these action principles. To adapt action principles to specific task requirements, we propose a new optimization framework, Reflective Principle Optimization (RPO). After execution, RPO employs a reflector to critique current action principles and an optimizer to update them accordingly. We investigate the RPO framework under two scenarios: Reward-RPO, which uses environmental rewards for reflection, and Self-RPO, which conducts self-reflection without external rewards. Additionally, we developed two RPO methods, RPO-Traj and RPO-Batch, to adapt to different settings. Experimental results across four environments demonstrate that the PRAct agent, leveraging the RPO framework, can effectively learn and apply action principles to enhance performance.
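The reflect-then-optimize loop described in the abstract can be pictured with a minimal Python sketch. This is an illustration under stated assumptions, not the authors' implementation: the function name `rpo`, its parameters, and the three callables (`act`, `reflect`, `optimize`) are hypothetical, and in practice each callable would presumably wrap an LLM call. Passing a `reward_fn` loosely corresponds to the Reward-RPO scenario, and omitting it to Self-RPO; the sketch does not attempt to distinguish RPO-Traj from RPO-Batch.

```python
from typing import Callable, List, Optional

Trajectory = List[str]  # hypothetical representation: one string per step


def rpo(
    principles: str,
    tasks: List[str],
    act: Callable[[str, str], Trajectory],
    reflect: Callable[[str, Trajectory, Optional[float]], str],
    optimize: Callable[[str, str], str],
    reward_fn: Optional[Callable[[Trajectory], float]] = None,
    steps: int = 3,
) -> str:
    """Illustrative reflect-and-update loop over textual action principles."""
    for step in range(steps):
        task = tasks[step % len(tasks)]
        # Execute the task while following the current principles.
        trajectory = act(principles, task)
        # With a reward function the critique can use an external signal;
        # without one the reflector must rely on self-reflection alone.
        reward = reward_fn(trajectory) if reward_fn is not None else None
        critique = reflect(principles, trajectory, reward)
        # The optimizer rewrites the principles based on the critique.
        principles = optimize(principles, critique)
    return principles


# Toy stand-ins so the sketch runs without any model; all are assumptions.
def toy_act(principles: str, task: str) -> Trajectory:
    return [f"task: {task}", f"followed: {principles}"]


def toy_reflect(principles: str, trajectory: Trajectory,
                reward: Optional[float]) -> str:
    source = "self" if reward is None else f"reward={reward}"
    return f"critique({source})"


def toy_optimize(principles: str, critique: str) -> str:
    return f"{principles} | {critique}"
```

Calling `rpo("p0", ["t"], toy_act, toy_reflect, toy_optimize, steps=2)` exercises the reward-free path; supplying `reward_fn` exercises the reward-based one.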

Anthology ID: 2024.conll-1.33
Volume: Proceedings of the 28th Conference on Computational Natural Language Learning
Month: November
Year: 2024
Address: Miami, FL, USA
Editors: Libby Barak, Malihe Alikhani
Venue: CoNLL
Publisher: Association for Computational Linguistics
Pages: 442–446
URL: https://aclanthology.org/2024.conll-1.33
DOI: 10.18653/v1/2024.conll-1.33
Cite (ACL): Zhiwei Liu, Weiran Yao, Jianguo Zhang, Zuxin Liu, Liangwei Yang, Rithesh R N, Tian Lan, Ming Zhu, Juntao Tan, Shirley Kokane, Thai Quoc Hoang, Juan Carlos Niebles, Shelby Heinecke, Huan Wang, Silvio Savarese, and Caiming Xiong. 2024. PRACT: Optimizing Principled Reasoning and Acting of LLM Agent. In Proceedings of the 28th Conference on Computational Natural Language Learning, pages 442–446, Miami, FL, USA. Association for Computational Linguistics.
Cite (Informal): PRACT: Optimizing Principled Reasoning and Acting of LLM Agent (Liu et al., CoNLL 2024)
PDF: https://preview.aclanthology.org/dois-2013-emnlp/2024.conll-1.33.pdf
