Anna Lokrantz




2024

MedQA-SWE - a Clinical Question & Answer Dataset for Swedish
Niclas Hertzberg | Anna Lokrantz
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Considering the rapid improvement of large generative language models, it is important to measure their ability to encode clinical domain knowledge in order to help determine their potential utility in a clinical setting. To this end, we present MedQA-SWE – a novel multiple-choice clinical question & answer (Q&A) dataset in Swedish consisting of 3,180 questions. The dataset was created from a series of exams aimed at evaluating doctors’ clinical understanding and decision-making, and it is the first open-source clinical Q&A dataset in Swedish. The exams – originally in PDF format – were parsed, and each question was manually checked and curated in order to limit errors in the dataset. We provide dataset statistics along with benchmark accuracy scores of seven large generative language models on a representative sample of questions in a zero-shot setting, with some models showing impressive performance given the difficulty of the exam the dataset is based on.