A Multilingual Voice Analytics Module for Contact-Center Hiring

Wagner W. Ávila Bombardelli, Vanessa Marquiafavel Serrani, Edgard Kuboo, Erica C. Marins Missão


Abstract
Contact-center operations often face significant challenges in identifying candidates whose vocal performance aligns with high-quality customer interactions. Existing speech analytics tools typically assess only content, providing limited insight into how candidates speak. To address this gap, we introduce SR-Voice, a multilingual speech analytics module designed to support call-center hiring. SR-Voice extends a previous text-only auditor by integrating segment-level, audio-native analysis capable of generating judgments, concise evidence-based rationales, and 0–10 scores across three dimensions: Emotion, Communication, and Rhythm. Our two-stage architecture first applies an audio-native model to propose a label, which is then reassessed by a lightweight auditor that combines transcript cues with acoustic and timing indicators grounded in phonetic and prosodic theory. We evaluate SR-Voice on a production-like volunteer dataset and report strong agreement and calibration, with Macro-F1 reaching 0.83 and Expected Calibration Error (ECE) of 0.053. The hybrid system achieves state-of-the-art calibration without post-hoc adjustment, while the audio-only variant attains the lowest Negative Log-Likelihood (NLL = 0.472). Designed for operational practicality, SR-Voice emphasizes traceability, short rationales, and well-calibrated probabilities suitable for threshold-based decisions and human-in-the-loop triage. We also discuss privacy-preserving storage and the prospective masking of Personally Identifiable Information (PII) for archival data.
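
For readers unfamiliar with the three metrics named in the abstract, the following is a minimal sketch of their textbook definitions, not the authors' evaluation code. The function names, the equal-width binning with 10 bins for ECE, and the per-example averaging for NLL are assumptions made for illustration; the abstract does not state which variants were used.

```python
import math

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for lab in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == lab and p == lab)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != lab and p == lab)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == lab and p != lab)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE with equal-width bins (assumed): the size-weighted average of
    |accuracy - mean confidence| over non-empty confidence bins."""
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[idx].append((conf, ok))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(accuracy - mean_conf)
    return ece

def negative_log_likelihood(true_class_probs, eps=1e-12):
    """Mean negative log of the probability assigned to the correct class."""
    clipped = [max(p, eps) for p in true_class_probs]  # avoid log(0)
    return -sum(math.log(p) for p in clipped) / len(clipped)
```

Lower ECE and NLL indicate better-calibrated probabilities, which is what makes a fixed decision threshold meaningful in the triage setting the abstract describes.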
Anthology ID:
2026.propor-1.6
Volume:
Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 1
Month:
April
Year:
2026
Address:
Salvador, Brazil
Editors:
Marlo Souza, Iria de-Dios-Flores, Diana Santos, Larissa Freitas, Jackson Wilke da Cruz Souza, Eugénio Ribeiro
Venue:
PROPOR
Publisher:
Association for Computational Linguistics
Pages:
57–66
URL:
https://preview.aclanthology.org/ingest-dnd/2026.propor-1.6/
Cite (ACL):
Wagner W. Ávila Bombardelli, Vanessa Marquiafavel Serrani, Edgard Kuboo, and Erica C. Marins Missão. 2026. A Multilingual Voice Analytics Module for Contact-Center Hiring. In Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 1, pages 57–66, Salvador, Brazil. Association for Computational Linguistics.
Cite (Informal):
A Multilingual Voice Analytics Module for Contact-Center Hiring (Bombardelli et al., PROPOR 2026)
PDF:
https://preview.aclanthology.org/ingest-dnd/2026.propor-1.6.pdf