@inproceedings{lee-etal-2025-promtec,
  title     = {{PROMTEC}: Fast {LLM} Inference Decoding using Prompt Multi-Lookup with Template Database and Common Sequences},
  author    = {Lee, Alan Chi-Man and
               Cheng, Wing-Sun and
               Chan, Calvin Chun-Kit},
  editor    = {Che, Wanxiang and
               Nabende, Joyce and
               Shutova, Ekaterina and
               Pilehvar, Mohammad Taher},
  booktitle = {Findings of the Association for Computational Linguistics: ACL 2025},
  month     = jul,
  year      = {2025},
  address   = {Vienna, Austria},
  publisher = {Association for Computational Linguistics},
  url       = {https://preview.aclanthology.org/display_plenaries/2025.findings-acl.355/},
  pages     = {6830--6842},
  isbn      = {979-8-89176-256-5},
  abstract  = {We propose PROMTEC, a novel multi-faceted approach to accelerate the inference of large language models (LLMs) by leveraging three key techniques: Prompt Multi-Lookup, Template Datastore, and Common Sequences methods. Prompt Multi-Lookup enhances the autoregressive decoding efficiency by generating multiple candidate sequences from context. Template Datastore exploits structured patterns, particularly in mathematical and code generation tasks, to enable fast and accurate candidate generation. Common Sequences optimize inference by precomputing frequent short sequences in specialized domains. For mathematical generation, PROMTEC achieves a 3.91 $\times$ speedup on the miniF2F benchmark. For code generation, it achieves up to a 4.23 $\times$ speedup on the HumanEval benchmark. This work highlights the potential of integrated candidate generation to accelerate LLM inference while maintaining high-quality outputs.},
}
@comment{
Markdown (Informal):
[PROMTEC: Fast LLM Inference Decoding using Prompt Multi-Lookup with Template Database and Common Sequences](https://preview.aclanthology.org/display_plenaries/2025.findings-acl.355/) (Lee et al., Findings 2025)
ACL
}