MusicAgent: An AI Agent for Music Understanding and Generation with Large Language Models

Dingyao Yu, Kaitao Song, Peiling Lu, Tianyu He, Xu Tan, Wei Ye, Shikun Zhang, Jiang Bian


Abstract
AI-empowered music processing is a diverse feld that encompasses dozens of tasks, ranging from generation tasks (e.g., timbre synthesis) to comprehension tasks (e.g., music classifcation). For developers and amateurs, it is very diffcult to grasp all of these task to satisfy their requirements in music processing, especially considering the huge differences in the representations of music data and the model applicability across platforms among various tasks. Consequently, it is necessary to build a system to organize and integrate these tasks, and thus help practitioners to automatically analyze their demand and call suitable tools as solutions to fulfill their requirements. Inspired by the recent success of large language models (LLMs) in task automation, we develop a system, named MusicAgent, which integrates numerous music-related tools and an autonomous workflow to address user requirements. More specifically, we build 1) toolset that collects tools from diverse sources, including Hugging Face, GitHub, and Web API, etc. 2) an autonomous workflow empowered by LLMs (e.g., ChatGPT) to organize these tools and automatically decompose user requests into multiple sub-tasks and invoke corresponding music tools. The primary goal of this system is to free users from the intricacies of AI-music tools, enabling them to concentrate on the creative aspect. By granting users the freedom to effortlessly combine tools, the system offers a seamless and enriching music experience. The code is available on GitHub along with a brief instructional video.
Anthology ID:
2023.emnlp-demo.21
Volume:
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: System Demonstrations
Month:
December
Year:
2023
Address:
Singapore
Editors:
Yansong Feng, Els Lefever
Venue:
EMNLP
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
246–255
Language:
URL:
https://aclanthology.org/2023.emnlp-demo.21
DOI:
10.18653/v1/2023.emnlp-demo.21
Bibkey:
Cite (ACL):
Dingyao Yu, Kaitao Song, Peiling Lu, Tianyu He, Xu Tan, Wei Ye, Shikun Zhang, and Jiang Bian. 2023. MusicAgent: An AI Agent for Music Understanding and Generation with Large Language Models. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pages 246–255, Singapore. Association for Computational Linguistics.
Cite (Informal):
MusicAgent: An AI Agent for Music Understanding and Generation with Large Language Models (Yu et al., EMNLP 2023)
Copy Citation:
PDF:
https://preview.aclanthology.org/nschneid-patch-4/2023.emnlp-demo.21.pdf