A Brief Survey of Textual Dialogue Corpora

Hugo Gonçalo Oliveira, Patrícia Ferreira, Daniel Martins, Catarina Silva, Ana Alves


Abstract
Several dialogue corpora are currently available for research purposes, but they still fall short of the needs created by the growing interest in developing dialogue systems, each with its own specific requirements. To help those who need such a corpus, this paper surveys a range of available options in terms of speakers, size, languages, collection, annotations, and domains. Some trends are identified, and possible approaches for creating new corpora are also discussed.
Anthology ID:
2022.lrec-1.135
Volume:
Proceedings of the Thirteenth Language Resources and Evaluation Conference
Month:
June
Year:
2022
Address:
Marseille, France
Editors:
Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
1264–1274
URL:
https://aclanthology.org/2022.lrec-1.135
Cite (ACL):
Hugo Gonçalo Oliveira, Patrícia Ferreira, Daniel Martins, Catarina Silva, and Ana Alves. 2022. A Brief Survey of Textual Dialogue Corpora. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 1264–1274, Marseille, France. European Language Resources Association.
Cite (Informal):
A Brief Survey of Textual Dialogue Corpora (Gonçalo Oliveira et al., LREC 2022)
PDF:
https://preview.aclanthology.org/nschneid-patch-1/2022.lrec-1.135.pdf
Data:
CoQA, EMOTyDA, EmotionLines, MELD, MultiWOZ, OpenSubtitles, QuAC, ReDial, SAMSum, SGD, Taskmaster-1, Topical-Chat, UDC