Completing and Validating the Re-Aligned Switchboard Dialog Act Corpus

Run Chen, Zihao Tao, John Prado, Ignazio LaManna, Ryan Puterbaugh, Mim Datta, Julia Hirschberg


Abstract
Although the Switchboard Dialog Act (SwDA) corpus is widely used in dialog act prediction and generation, models that incorporate prosodic information have performed poorly on it because its speech and text data are misaligned. In this paper, we report the completion of the work begun in Chen et al. (2024) to address these misalignment issues, resulting in an improved SwDA corpus called RASwDA (Re-Aligned Switchboard Dialog Act Corpus). Now fully re-aligned and validated, RASwDA meets accuracy standards that allow classification models trained on it to exceed the benchmarks set by models trained on other Switchboard subcorpora.
Anthology ID:
2026.law-main.13
Volume:
Proceedings of the 20th Linguistic Annotation Workshop (LAW XX)
Month:
July
Year:
2026
Address:
San Diego, California, USA
Editors:
Yang Janet Liu, Luke Gessler
Venues:
LAW | WS
Publisher:
Association for Computational Linguistics
Pages:
173–177
URL:
https://preview.aclanthology.org/ingest-acl-workshops/2026.law-main.13/
Cite (ACL):
Run Chen, Zihao Tao, John Prado, Ignazio LaManna, Ryan Puterbaugh, Mim Datta, and Julia Hirschberg. 2026. Completing and Validating the Re-Aligned Switchboard Dialog Act Corpus. In Proceedings of the 20th Linguistic Annotation Workshop (LAW XX), pages 173–177, San Diego, California, USA. Association for Computational Linguistics.
Cite (Informal):
Completing and Validating the Re-Aligned Switchboard Dialog Act Corpus (Chen et al., LAW 2026)
PDF:
https://preview.aclanthology.org/ingest-acl-workshops/2026.law-main.13.pdf