A Training Data Recipe to Accelerate A* Search with Language Models

Devaansh Gupta; Boyang Li

doi:10.18653/v1/2024.findings-emnlp.391

Abstract

Combining Large Language Models (LLMs) with heuristic search algorithms like A* holds the promise of enhanced LLM reasoning and scalable inference. To accelerate training and reduce computational demands, we investigate the coreset selection problem for the training data of LLM heuristic learning. Few methods for learning heuristic functions consider the interaction between the search algorithm and the machine learning model. In this work, we empirically disentangle the requirements of the A* search algorithm from the requirements of the LLM to generalise on this task. Surprisingly, we find an overlap between their requirements: A* requires more accurate predictions on search nodes near the goal, and LLMs need the same set of nodes for effective generalisation. With these insights, we derive a data-selection distribution for learning LM-based heuristics. On three classical planning domains (maze navigation, Sokoban and sliding tile puzzles), our technique reduces the number of iterations required to find the solutions by up to 15x, with a wall-clock speed-up of search of up to 5x. The code has been made available at https://github.com/devaansh100/a_star.
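
To make the quantities in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It shows (a) an A* loop on a toy grid maze that counts iterations (node expansions), the quantity the abstract reports reductions in, and (b) one hypothetical way to bias training-node selection toward the goal. The exponential weighting, the `temperature` parameter, and all function names are assumptions made for illustration; the abstract does not specify the form of the paper's data-selection distribution. A hand-written Manhattan distance stands in for the learned LM heuristic.

```python
import heapq
import math
import random


def manhattan(node, goal):
    """Stand-in heuristic; in the paper's setting an LM would predict this value."""
    return abs(node[0] - goal[0]) + abs(node[1] - goal[1])


def a_star(grid, start, goal, heuristic):
    """A* on a 4-connected grid ('#' = wall). Returns (path, iterations)."""
    open_heap = [(heuristic(start, goal), 0, start)]
    g_cost = {start: 0}
    parent = {start: None}
    closed = set()
    iterations = 0  # number of node expansions
    while open_heap:
        _, g, node = heapq.heappop(open_heap)
        if node in closed:
            continue  # stale heap entry
        closed.add(node)
        iterations += 1
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1], iterations
        r, c = node
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue
            if grid[nr][nc] == "#" or nxt in closed:
                continue
            new_g = g + 1
            if new_g < g_cost.get(nxt, math.inf):
                g_cost[nxt] = new_g
                parent[nxt] = node
                heapq.heappush(open_heap, (new_g + heuristic(nxt, goal), new_g, nxt))
    return None, iterations


def near_goal_weights(path_length, temperature=2.0):
    """Hypothetical selection weights: larger for nodes closer to the goal.

    Node i on a start-to-goal path is (path_length - 1 - i) steps from the goal.
    The exponential form is an assumption, not the paper's distribution.
    """
    raw = [math.exp(-(path_length - 1 - i) / temperature) for i in range(path_length)]
    total = sum(raw)
    return [w / total for w in raw]


def sample_training_nodes(path, k, temperature=2.0, seed=0):
    """Draw k nodes (with replacement) from a solved path, favouring the goal end."""
    rng = random.Random(seed)
    weights = near_goal_weights(len(path), temperature)
    return rng.choices(path, weights=weights, k=k)
```

Under these assumptions, one would solve a set of training puzzles, call `sample_training_nodes` on each solution path to build a reduced training set, and compare the `iterations` returned by `a_star` when it is driven by heuristics trained on different subsets. Lowering `temperature` concentrates the sample more tightly around the goal.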

Anthology ID: 2024.findings-emnlp.391
Volume: Findings of the Association for Computational Linguistics: EMNLP 2024
Month: November
Year: 2024
Address: Miami, Florida, USA
Editors: Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 6681–6695
URL: https://preview.aclanthology.org/fix-sig-urls/2024.findings-emnlp.391/
DOI: 10.18653/v1/2024.findings-emnlp.391
Cite (ACL): Devaansh Gupta and Boyang Li. 2024. A Training Data Recipe to Accelerate A* Search with Language Models. In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 6681–6695, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal): A Training Data Recipe to Accelerate A* Search with Language Models (Gupta & Li, Findings 2024)
PDF: https://preview.aclanthology.org/fix-sig-urls/2024.findings-emnlp.391.pdf
