Ryan Puterbaugh
2026
Completing and Validating the Re-Aligned Switchboard Dialog Act Corpus
Run Chen | Zihao Tao | John Prado | Ignazio LaManna | Ryan Puterbaugh | Mim Datta | Julia Hirschberg
Proceedings of the 20th Linguistic Annotation Workshop (LAW XX)
Although widely used in dialog act prediction and generation, the Switchboard Dialog Act (SwDA) corpus has yielded poor results in models that incorporate prosodic information because its speech and text data are misaligned. In this paper, we report the completion of the work begun in Chen et al. (2024) to address these misalignment issues with an improved SwDA corpus called RASwDA (Re-Aligned Switchboard Dialog Act Corpus). Now fully re-aligned and validated, RASwDA meets a standard of accuracy that allows classification models trained on it to exceed the benchmarks set by models trained on other Switchboard subcorpora.