Octopus: On-device language model for function calling of software APIs
Wei Chen | Zhiyuan Li | Mingyuan Ma
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 3: Industry Track), 2025
Large Language Models (LLMs) are pivotal for advanced text processing and generation. This study presents a framework for training a series of on-device LLMs optimized for invoking software APIs. Using a curated dataset of 30,000 API function calls drawn from software documentation, we fine-tune LLMs with 2B, 3B, and 7B parameters to enhance their proficiency in API interactions. Our approach improves the models' understanding of API structures and syntax, yielding significantly higher accuracy in API function calling. We also propose a conditional masking technique, tailored specifically to API tasks, that enforces correct output formats and reduces errors without sacrificing inference speed. The fine-tuned model, Octopus, outperforms GPT-4 on API calling tasks, showcasing advancements in automated software development and API integration. The model checkpoints are publicly available.
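The abstract does not spell out how the conditional masking is implemented. As a rough illustration only, the sketch below shows the general idea behind format-enforcing decoding: at each step the decoder's logits are masked so that only tokens consistent with the expected API-call structure can be emitted. The toy vocabulary, the linear grammar, and the function name get_weather are invented for this example and are not taken from the paper.

    import numpy as np

    # Toy vocabulary and a linear "grammar": each token may only be followed by one
    # specific continuation, the simplest case of a format-enforcing mask.
    VOCAB = ["get_weather(", "city=", "\"Boston\"", ")", "<eos>"]
    NEXT_ALLOWED = {None: {0}, 0: {1}, 1: {2}, 2: {3}, 3: {4}}

    def mask_logits(logits, allowed_ids):
        """Set logits of disallowed tokens to -inf so they can never be chosen."""
        masked = np.full_like(logits, -np.inf)
        idx = list(allowed_ids)
        masked[idx] = logits[idx]
        return masked

    def constrained_greedy_decode(step_logits):
        """Greedy decoding where each step's logits are masked based on the previous token."""
        out, prev = [], None
        for logits in step_logits:
            tok_id = int(np.argmax(mask_logits(logits, NEXT_ALLOWED[prev])))
            out.append(VOCAB[tok_id])
            if VOCAB[tok_id] == "<eos>":
                break
            prev = tok_id
        if out and out[-1] == "<eos>":
            out.pop()
        return "".join(out)

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        fake_logits = [rng.normal(size=len(VOCAB)) for _ in range(5)]
        # Regardless of the (random) logits, the output is always a well-formed call:
        print(constrained_greedy_decode(fake_logits))  # get_weather(city="Boston")

Because the mask only zeroes out invalid continuations rather than altering the model or its sampling loop, a scheme of this kind keeps per-step cost essentially unchanged, which is consistent with the abstract's claim of maintaining inference speed.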