PriMock57: A Dataset Of Primary Care Mock Consultations
Alex Papadopoulos Korfiatis, Francesco Moramarco, Radmila Sarac, Aleksandar Savkov
Abstract
Recent advances in Automatic Speech Recognition (ASR) have made it possible to reliably produce automatic transcripts of clinician-patient conversations. However, access to clinical datasets is heavily restricted due to patient privacy, thus slowing down normal research practices. We detail the development of a public access, high-quality dataset comprising 57 mocked primary care consultations, including audio recordings, their manual utterance-level transcriptions, and the associated consultation notes. Our work illustrates how the dataset can be used as a benchmark for conversational medical ASR as well as consultation note generation from transcripts.
- Anthology ID:
- 2022.acl-short.65
- Volume:
- Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
- Month:
- May
- Year:
- 2022
- Address:
- Dublin, Ireland
- Editors:
- Smaranda Muresan, Preslav Nakov, Aline Villavicencio
- Venue:
- ACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 588–598
- URL:
- https://aclanthology.org/2022.acl-short.65
- DOI:
- 10.18653/v1/2022.acl-short.65
- Cite (ACL):
- Alex Papadopoulos Korfiatis, Francesco Moramarco, Radmila Sarac, and Aleksandar Savkov. 2022. PriMock57: A Dataset Of Primary Care Mock Consultations. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 588–598, Dublin, Ireland. Association for Computational Linguistics.
- Cite (Informal):
- PriMock57: A Dataset Of Primary Care Mock Consultations (Papadopoulos Korfiatis et al., ACL 2022)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-2/2022.acl-short.65.pdf