A French Medical Conversations Corpus Annotated for a Virtual Patient Dialogue System

Fréjus A. A. Laleye, Gaël de Chalendar, Antonia Blanié, Antoine Brouquet, Dan Behnamou


Abstract
Data-driven approaches for creating virtual patient dialogue systems require large amounts of data specific to the language, domain and clinical cases studied. Given the lack of dialogue corpora in French for medical education, we propose an annotated corpus of dialogues including medical consultation interactions between doctor and patient. In this work, we detail the building process of the proposed dialogue corpus, describe the annotation guidelines and present the statistics of its contents. We then conduct a question categorization task to evaluate the benefits of the proposed corpus, which is made publicly available.
Anthology ID:
2020.lrec-1.72
Volume:
Proceedings of the Twelfth Language Resources and Evaluation Conference
Month:
May
Year:
2020
Address:
Marseille, France
Editors:
Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
574–580
Language:
English
URL:
https://aclanthology.org/2020.lrec-1.72
Cite (ACL):
Fréjus A. A. Laleye, Gaël de Chalendar, Antonia Blanié, Antoine Brouquet, and Dan Behnamou. 2020. A French Medical Conversations Corpus Annotated for a Virtual Patient Dialogue System. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 574–580, Marseille, France. European Language Resources Association.
Cite (Informal):
A French Medical Conversations Corpus Annotated for a Virtual Patient Dialogue System (Laleye et al., LREC 2020)
PDF:
https://preview.aclanthology.org/nschneid-patch-2/2020.lrec-1.72.pdf
Code
 kleag/labforsims2-corpus