A Language-First Approach for Procedure Planning

Jiateng Liu, Sha Li, Zhenhailong Wang, Manling Li, Heng Ji


Abstract
Procedure planning, the ability to predict a series of steps that achieves a given goal conditioned on the current observation, is critical for building intelligent embodied agents that can assist users in everyday tasks. Encouraged by the recent success of language models (LMs) in zero-shot and few-shot planning, we hypothesize that LMs may be equipped with stronger priors for planning than their visual counterparts. To this end, we propose a language-first procedure planning framework with a modularized design: we first align the current and goal observations with corresponding steps and then use a pre-trained LM to predict the intermediate steps. Under this framework, we find that using an image captioning model for alignment can already match state-of-the-art performance. By designing a double retrieval model conditioned jointly on the current and goal observations, we achieve large improvements (a 19.2%-98.9% relative increase in success rate over the state of the art) on both the COIN and CrossTask benchmarks. Our work verifies the planning ability of LMs and demonstrates how LMs can serve as a powerful “reasoning engine” even when the input is provided in another modality.
Anthology ID:
2023.findings-acl.122
Volume:
Findings of the Association for Computational Linguistics: ACL 2023
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
1941–1954
URL:
https://aclanthology.org/2023.findings-acl.122
DOI:
10.18653/v1/2023.findings-acl.122
Cite (ACL):
Jiateng Liu, Sha Li, Zhenhailong Wang, Manling Li, and Heng Ji. 2023. A Language-First Approach for Procedure Planning. In Findings of the Association for Computational Linguistics: ACL 2023, pages 1941–1954, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
A Language-First Approach for Procedure Planning (Liu et al., Findings 2023)
PDF:
https://preview.aclanthology.org/naacl24-info/2023.findings-acl.122.pdf