Polish-English medical knowledge transfer: A new benchmark and results
Łukasz Grzybowski, Jakub Pokrywka, Michał Ciesiółka, Jeremi Ignacy Kaczmarek, Marek Kubis
Abstract
Large Language Models (LLMs) have demonstrated significant potential in specialized tasks, including medical problem-solving. However, most studies predominantly focus on English-language contexts. This study introduces a novel benchmark dataset based on Polish medical licensing and specialization exams (LEK, LDEK, PES). The dataset, sourced from publicly available materials provided by the Medical Examination Center and the Chief Medical Chamber, includes Polish medical exam questions, along with a subset of parallel Polish-English corpora professionally translated for foreign candidates. By structuring a benchmark from these exam questions, we evaluate state-of-the-art LLMs, spanning general-purpose, domain-specific, and Polish-specific models, and compare their performance with that of human medical students and doctors. Our analysis shows that while models like GPT-4o achieve near-human performance, challenges persist in cross-lingual translation and domain-specific understanding. These findings highlight disparities in model performance across languages and medical specialties, emphasizing the limitations and ethical considerations of deploying LLMs in clinical practice.- Anthology ID:
- 2025.findings-emnlp.480
- Volume:
- Findings of the Association for Computational Linguistics: EMNLP 2025
- Month:
- November
- Year:
- 2025
- Address:
- Suzhou, China
- Editors:
- Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
- Venue:
- Findings
- SIG:
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 9042–9063
- Language:
- URL:
- https://preview.aclanthology.org/name-variant-enfa-fane/2025.findings-emnlp.480/
- DOI:
- 10.18653/v1/2025.findings-emnlp.480
- Cite (ACL):
- Łukasz Grzybowski, Jakub Pokrywka, Michał Ciesiółka, Jeremi Ignacy Kaczmarek, and Marek Kubis. 2025. Polish-English medical knowledge transfer: A new benchmark and results. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 9042–9063, Suzhou, China. Association for Computational Linguistics.
- Cite (Informal):
- Polish-English medical knowledge transfer: A new benchmark and results (Grzybowski et al., Findings 2025)
- PDF:
- https://preview.aclanthology.org/name-variant-enfa-fane/2025.findings-emnlp.480.pdf