CBAL: Context-Based Agentic Learning for Speaker Diarization Segmentation Refinement

Odwitiyo Dutta, Dinesh K Vishwakarma


Abstract
Speaker diarization systems produce segmentation errors, such as false splits and boundary misplacements, that degrade transcript readability and downstream applications. We present CBAL (Context-Based Agentic Learning), a post-processing framework that refines segmentation boundaries in diarized scripts through targeted error correction. CBAL detects potential segmentation errors using acoustic and temporal heuristics and employs a lightweight LLM agent to reason about merge decisions, validating corrections through uncertainty-aware filtering with signal-based constraints. CBAL achieves 93.4% accuracy across 359 applied merges and reduces segment count by 6.1%. We demonstrate that our framework identifies and corrects high-confidence errors while maintaining 0% degradation in terms of concatenated minimum-permutation Word Error Rate (cpWER). An ablation study confirms that each component contributes non-redundantly, demonstrating the viability of interpretable refinement frameworks that use the strengths of acoustic models and language understanding without requiring end-to-end retraining.
Anthology ID:
2026.acl-srw.58
Volume:
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop)
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Santosh T.Y.S.S., Juan Diego Rodriguez, Ona de Gibert
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
632–647
URL:
https://preview.aclanthology.org/ingestion-form-platform/2026.acl-srw.58/
Cite (ACL):
Odwitiyo Dutta and Dinesh K Vishwakarma. 2026. CBAL: Context-Based Agentic Learning for Speaker Diarization Segmentation Refinement. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop), pages 632–647, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
CBAL: Context-Based Agentic Learning for Speaker Diarization Segmentation Refinement (Dutta & Vishwakarma, ACL 2026)
PDF:
https://preview.aclanthology.org/ingestion-form-platform/2026.acl-srw.58.pdf