Gatekeeper to save COGS and improve efficiency of Text Prediction
Nidhi Tiwari, Sneha Kola, Milos Milunovic, Si-qing Chen, Marjan Slavkovski
Abstract
The text prediction (TP) workflow calls a large language model (LLM) after almost every keystroke to generate the subsequent sequence of characters, until the user accepts a suggestion. The confidence score of a prediction is commonly used to filter the results so that only correct predictions are shown to the user. Because LLMs require massive amounts of computation and storage, this approach incurs high network and execution costs. We therefore propose a model gatekeeper (GK) that stops, at the client-application level itself, the LLM calls that would result in incorrect predictions. In this way the GK saves model-inference cost and improves the user experience by not surfacing incorrect predictions. We demonstrate that the gatekeeper saves approximately 46.6% of TP COGS at the cost of an approximately 4.5% loss in character savings, and improves the efficiency (suggestion rate) of the TP model by 73%.
- Anthology ID: 2023.emnlp-industry.5
- Volume: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: Industry Track
- Month: December
- Year: 2023
- Address: Singapore
- Editors: Mingxuan Wang, Imed Zitouni
- Venue: EMNLP
- Publisher: Association for Computational Linguistics
- Pages: 46–53
- URL: https://aclanthology.org/2023.emnlp-industry.5
- DOI: 10.18653/v1/2023.emnlp-industry.5
- Cite (ACL): Nidhi Tiwari, Sneha Kola, Milos Milunovic, Si-qing Chen, and Marjan Slavkovski. 2023. Gatekeeper to save COGS and improve efficiency of Text Prediction. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: Industry Track, pages 46–53, Singapore. Association for Computational Linguistics.
- Cite (Informal): Gatekeeper to save COGS and improve efficiency of Text Prediction (Tiwari et al., EMNLP 2023)
- PDF: https://preview.aclanthology.org/ingest-acl-2023-videos/2023.emnlp-industry.5.pdf
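As a rough illustration of the gatekeeper idea from the abstract, the sketch below shows a client-side check that decides whether an LLM call is worth making before issuing it. The scoring function, its features (word-boundary position, typed-prefix length), and the threshold are all hypothetical stand-ins, not the paper's actual model, which is a trained classifier evaluated in the paper itself.

```python
# Hypothetical sketch of a client-side gatekeeper for text prediction.
# The features and threshold here are illustrative only; the paper trains
# a dedicated gatekeeper model rather than using these heuristics.

def gatekeeper_score(prefix: str) -> float:
    """Cheap local estimate of whether an LLM call on this prefix is
    likely to yield a prediction confident enough to be shown."""
    if not prefix or prefix.endswith((" ", "\n")):
        # Start of a new word: completions tend to be low-confidence.
        return 0.2
    current_word = prefix.split()[-1]
    # A longer typed prefix of the current word constrains the completion.
    return min(1.0, 0.3 + 0.15 * len(current_word))

def should_call_llm(prefix: str, threshold: float = 0.5) -> bool:
    """Skip the LLM call when the gatekeeper predicts the resulting
    prediction would be filtered out anyway, saving inference cost."""
    return gatekeeper_score(prefix) >= threshold

# Example: the call is skipped right after a space, but made mid-word.
print(should_call_llm("please review the "))       # skipped (False)
print(should_call_llm("please review the docum"))  # called (True)
```

Under this scheme every skipped call avoids one round trip to the LLM service, which is the source of the COGS savings the abstract reports; the trade-off is that a few predictions the user would have accepted are never requested (the character-savings loss).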