Voice synthesis in Polish and English - analyzing prediction differences in speaker verification systems
Joanna Gajewska, Alicja Martinek, Michał J. Ołowski, Ewelina Bartuzi-Trokielewicz
Abstract
Deep learning has significantly enhanced voice synthesis, yielding realistic audio capable of mimicking individual voices. This progress, however, raises security concerns due to the potential misuse of audio deepfakes. Our research examines the effects of deepfakes on speaker recognition systems across English and Polish corpora, assessing both Text-to-Speech and Voice Conversion methods. We focus on the biometric similarity’s role in the effectiveness of impersonations and find that synthetic voices can maintain personal traits, posing risks of unauthorized access. The study’s key contributions include analyzing voice synthesis across languages, evaluating biometric resemblance in voice conversion, and contrasting Text-to-Speech and Voice Conversion paradigms. These insights emphasize the need for improved biometric security against audio deepfake threats.
- Anthology ID:
- 2025.coling-main.643
- Volume:
- Proceedings of the 31st International Conference on Computational Linguistics
- Month:
- January
- Year:
- 2025
- Address:
- Abu Dhabi, UAE
- Editors:
- Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, Steven Schockaert
- Venue:
- COLING
- Publisher:
- Association for Computational Linguistics
- Pages:
- 9618–9629
- URL:
- https://preview.aclanthology.org/jlcl-multiple-ingestion/2025.coling-main.643/
- Cite (ACL):
- Joanna Gajewska, Alicja Martinek, Michał J. Ołowski, and Ewelina Bartuzi-Trokielewicz. 2025. Voice synthesis in Polish and English - analyzing prediction differences in speaker verification systems. In Proceedings of the 31st International Conference on Computational Linguistics, pages 9618–9629, Abu Dhabi, UAE. Association for Computational Linguistics.
- Cite (Informal):
- Voice synthesis in Polish and English - analyzing prediction differences in speaker verification systems (Gajewska et al., COLING 2025)
- PDF:
- https://preview.aclanthology.org/jlcl-multiple-ingestion/2025.coling-main.643.pdf