Ryan Puterbaugh


2026

Although the Switchboard Dialog Act (SwDA) corpus is widely used in dialog act prediction and generation, models that incorporate prosodic information have performed poorly on it because of misalignment between its speech and text data. In this paper, we report the completion of the work begun in Chen et al. (2024) to address these misalignment issues with an improved SwDA corpus called RASwDA (Re-Aligned Switchboard Dialog Act Corpus). Now fully re-aligned and validated, RASwDA is accurate enough that classification models trained on it exceed the benchmarks set by models trained on other Switchboard subcorpora.