Synthetic Lyrics Detection Across Languages and Genres
Yanis Labrak, Markus Frohmann, Gabriel Meseguer-Brocal, Elena V. Epure
Abstract
In recent years, the use of large language models (LLMs) to generate music content, particularly lyrics, has gained popularity. These advances provide valuable tools for artists and enhance their creative processes, but they also raise concerns about copyright violations, consumer satisfaction, and content spamming. Previous research has explored synthetic content detection in various domains. However, no work has focused on lyrics, the text modality of music. To address this gap, we curated a diverse dataset of real and synthetic lyrics from multiple languages, music genres, and artists. The generation pipeline was validated using both human and automated evaluation. We performed a thorough evaluation of existing synthetic text detection approaches on lyrics, a previously unexplored data type. We also investigated methods to adapt the best-performing features to lyrics through unsupervised domain adaptation. Following both music and industrial constraints, we examined how well these approaches generalize across languages, scale with data availability, handle multilingual content, and perform on novel genres in few-shot settings. Our findings show promising results that could inform policy decisions around AI-generated music and enhance transparency for users.
- Anthology ID: 2025.trustnlp-main.34
- Volume: Proceedings of the 5th Workshop on Trustworthy NLP (TrustNLP 2025)
- Month: May
- Year: 2025
- Address: Albuquerque, New Mexico
- Editors: Trista Cao, Anubrata Das, Tharindu Kumarage, Yixin Wan, Satyapriya Krishna, Ninareh Mehrabi, Jwala Dhamala, Anil Ramakrishna, Aram Galstyan, Anoop Kumar, Rahul Gupta, Kai-Wei Chang
- Venues: TrustNLP | WS
- Publisher: Association for Computational Linguistics
- Pages: 524–541
- URL: https://preview.aclanthology.org/fix-sig-urls/2025.trustnlp-main.34/
- Cite (ACL): Yanis Labrak, Markus Frohmann, Gabriel Meseguer-Brocal, and Elena V. Epure. 2025. Synthetic Lyrics Detection Across Languages and Genres. In Proceedings of the 5th Workshop on Trustworthy NLP (TrustNLP 2025), pages 524–541, Albuquerque, New Mexico. Association for Computational Linguistics.
- Cite (Informal): Synthetic Lyrics Detection Across Languages and Genres (Labrak et al., TrustNLP 2025)
- PDF: https://preview.aclanthology.org/fix-sig-urls/2025.trustnlp-main.34.pdf