Effective Speaker Diarization Leveraging Multi-task Logarithmic Loss Objectives
Jhih-Rong Guo, Tien-Hong Lo, Yu-Sheng Tsao, Pei-Ying Lee, Yung-Chang Hsu, Berlin Chen
Abstract
End-to-End Neural Diarization (EEND) has undergone substantial development, particularly with powerset classification methods that enhance performance but can exacerbate speaker confusion. To address this, we propose a novel training strategy that complements the standard cross entropy loss with an auxiliary ordinal log loss, guided by a distance matrix of speaker combinations. Our experiments reveal that while this approach yields significant relative improvements of 15.8% in false alarm rate and 10.0% in confusion error rate, it also uncovers a critical trade-off with an increased missed error rate. The primary contribution of this work is the identification and analysis of this trade-off, which stems from the model adopting a more conservative prediction strategy. This insight is crucial for designing more balanced and effective loss functions in speaker diarization.
- Anthology ID:
- 2025.rocling-main.17
- Volume:
- Proceedings of the 37th Conference on Computational Linguistics and Speech Processing (ROCLING 2025)
- Month:
- November
- Year:
- 2025
- Address:
- National Taiwan University, Taipei City, Taiwan
- Editors:
- Kai-Wei Chang, Ke-Han Lu, Chih-Kai Yang, Zhi-Rui Tam, Wen-Yu Chang, Chung-Che Wang
- Venue:
- ROCLING
- Publisher:
- Association for Computational Linguistics
- Pages:
- 140–145
- URL:
- https://preview.aclanthology.org/dashboard/2025.rocling-main.17/
- Cite (ACL):
- Jhih-Rong Guo, Tien-Hong Lo, Yu-Sheng Tsao, Pei-Ying Lee, Yung-Chang Hsu, and Berlin Chen. 2025. Effective Speaker Diarization Leveraging Multi-task Logarithmic Loss Objectives. In Proceedings of the 37th Conference on Computational Linguistics and Speech Processing (ROCLING 2025), pages 140–145, National Taiwan University, Taipei City, Taiwan. Association for Computational Linguistics.
- Cite (Informal):
- Effective Speaker Diarization Leveraging Multi-task Logarithmic Loss Objectives (Guo et al., ROCLING 2025)
- PDF:
- https://preview.aclanthology.org/dashboard/2025.rocling-main.17.pdf
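
Illustrative sketch (not from the paper): the abstract describes combining the standard cross entropy loss with an auxiliary ordinal log loss that is guided by a distance matrix over speaker combinations. The Python below is a minimal sketch of how such an objective could look for a single frame. Several details are assumptions, since the abstract does not specify them: the distance between two powerset classes is taken to be the size of the symmetric difference of their speaker sets, the ordinal log loss uses one commonly used form, -sum_i d(y, i)^alpha * log(1 - p_i), and the names `weight`, `alpha`, `powerset_classes`, and `multitask_loss` are hypothetical.

```python
import math
from itertools import combinations


def powerset_classes(num_speakers, max_overlap):
    """Enumerate speaker subsets of size 0..max_overlap as powerset classes."""
    classes = []
    for k in range(max_overlap + 1):
        classes.extend(frozenset(c) for c in combinations(range(num_speakers), k))
    return classes


def distance_matrix(classes):
    """Assumed distance: number of speakers in which two subsets differ."""
    return [[len(a ^ b) for b in classes] for a in classes]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, target):
    """Standard cross entropy for one frame with an integer class target."""
    return -math.log(probs[target])


def ordinal_log_loss(probs, target, dist, alpha=1.0, eps=1e-12):
    """Penalize probability mass on each class in proportion to its distance
    from the target class (assumed form of the ordinal log loss)."""
    return -sum(
        (dist[target][i] ** alpha) * math.log(max(1.0 - p, eps))
        for i, p in enumerate(probs)
    )


def multitask_loss(logits, target, dist, weight=0.5, alpha=1.0):
    """Cross entropy plus a weighted auxiliary ordinal log loss.
    The weighting scheme is an assumption made for illustration."""
    probs = softmax(logits)
    return cross_entropy(probs, target) + weight * ordinal_log_loss(
        probs, target, dist, alpha
    )


classes = powerset_classes(num_speakers=3, max_overlap=2)
dist = distance_matrix(classes)
target = classes.index(frozenset({0}))

# Same cross entropy in both cases, but the wrong mass sits on a near class
# ({0, 1}) in one case and on a far class ({1, 2}) in the other.
near = [0.0] * len(classes)
near[classes.index(frozenset({0, 1}))] = 5.0
far = [0.0] * len(classes)
far[classes.index(frozenset({1, 2}))] = 5.0

loss_near = multitask_loss(near, target, dist)
loss_far = multitask_loss(far, target, dist)
```

Under these assumptions, a confident mistake on a distant speaker combination costs more than one on a neighboring combination, even though the cross entropy term alone cannot tell the two apart. This is the kind of distance-aware penalty the abstract refers to; the paper's actual distance matrix and loss weighting may differ.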