CBAL: Context-Based Agentic Learning for Speaker Diarization Segmentation Refinement

Odwitiyo Dutta, Dinesh K Vishwakarma


Abstract
Speaker diarization systems produce segmentation errors, such as false splits and boundary misplacements, that degrade transcript readability and downstream applications. We present CBAL (Context-Based Agentic Learning), a post-processing framework that refines segmentation boundaries in diarized scripts through targeted error correction. CBAL detects potential segmentation errors using acoustic and temporal heuristics and employs a lightweight LLM agent to reason about merge decisions, validating corrections through uncertainty-aware filtering with signal-based constraints. CBAL achieves 93.4% accuracy across 359 applied merges and reduces segment count by 6.1%. We demonstrate that our framework identifies and corrects high-confidence errors while maintaining 0% degradation in terms of concatenated minimum-permutation Word Error Rate (cpWER). An ablation study confirms that each component contributes non-redundantly, demonstrating the viability of interpretable refinement frameworks that use the strengths of acoustic models and language understanding without requiring end-to-end retraining.
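The detect, reason, and filter stages named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Segment` type, the `max_gap` and `min_confidence` thresholds, and the rule-based `toy_agent` standing in for the LLM agent are all hypothetical, and the sketch assumes a false split shows up as two adjacent same-speaker segments separated by a short pause.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Segment:
    speaker: str
    start: float  # seconds
    end: float    # seconds
    text: str


# Hypothetical agent interface: returns (merge?, confidence in [0, 1]).
Agent = Callable[[Segment, Segment], Tuple[bool, float]]


def toy_agent(prev: Segment, nxt: Segment) -> Tuple[bool, float]:
    """Rule-based stand-in for an LLM agent (assumption, not from the paper):
    propose a merge when the first segment does not end a sentence."""
    unfinished = not prev.text.rstrip().endswith((".", "?", "!"))
    return unfinished, 0.9


def refine(segments: List[Segment], agent: Agent,
           max_gap: float = 0.5, min_confidence: float = 0.8) -> List[Segment]:
    """Merge adjacent segments that pass all three stages."""
    out: List[Segment] = []
    for seg in segments:
        if out:
            prev = out[-1]
            # Stage 1: temporal heuristic flags a candidate false split.
            candidate = (prev.speaker == seg.speaker
                         and seg.start - prev.end <= max_gap)
            if candidate:
                # Stage 2: the agent reasons about the merge decision.
                merge, confidence = agent(prev, seg)
                # Stage 3: only confident merge decisions are applied.
                if merge and confidence >= min_confidence:
                    out[-1] = Segment(prev.speaker, prev.start, seg.end,
                                      prev.text + " " + seg.text)
                    continue
        out.append(seg)
    return out


segments = [
    Segment("A", 0.0, 1.0, "I think we should"),
    Segment("A", 1.2, 2.0, "start now."),
    Segment("B", 2.1, 3.0, "Agreed."),
    Segment("A", 3.1, 4.0, "Great."),
]
refined = refine(segments, toy_agent)
```

In this toy input only the first two segments are merged, because the first ends mid-sentence and the pause is short. Raising `min_confidence` above the agent's reported confidence leaves the segmentation unchanged, which is the conservative behavior a confidence filter is meant to provide.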
Anthology ID:
2026.acl-srw.58
Volume:
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026)
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Santosh T.Y.S.S., Juan Diego Rodriguez, Ona de Gibert
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
632–647
URL:
https://preview.aclanthology.org/ingest-acl/2026.acl-srw.58/
Cite (ACL):
Odwitiyo Dutta and Dinesh K Vishwakarma. 2026. CBAL: Context-Based Agentic Learning for Speaker Diarization Segmentation Refinement. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026), pages 632–647, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
CBAL: Context-Based Agentic Learning for Speaker Diarization Segmentation Refinement (Dutta & Vishwakarma, ACL 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.acl-srw.58.pdf