Fine-tuning LLMs to Extract Epilepsy Seizure Frequency Data from Health Records
Ben Holgate, Joe Davies, Shichao Fang, Joel Winston, James Teo, Mark Richardson
Abstract
We developed a new methodology for extracting the frequency of a patient’s epilepsy seizures from unstructured, free-text outpatient clinic letters by, first, devising a single unit of measurement for seizure frequency and, second, fine-tuning a generative Large Language Model (LLM) on our bespoke annotated dataset. We measured frequency as the number of seizures per month: a frequency of one seizure or more is expressed as an integer, and a frequency of less than one as a decimal. This approach enables us to track whether a patient’s seizures are improving over time. We found that fine-tuning improves the F1 score of our best-performing LLM, Ministral-8B-Instruct-2410, by around three times compared to an untrained model. We also found that Ministral demonstrated an impressive ability for mathematical reasoning.
- Anthology ID:
- 2025.bionlp-1.5
- Volume:
- Proceedings of the 24th Workshop on Biomedical Language Processing
- Month:
- August
- Year:
- 2025
- Address:
- Vienna, Austria
- Editors:
- Dina Demner-Fushman, Sophia Ananiadou, Makoto Miwa, Junichi Tsujii
- Venues:
- BioNLP | WS
- Publisher:
- Association for Computational Linguistics
- Pages:
- 44–55
- URL:
- https://preview.aclanthology.org/acl25-workshop-ingestion/2025.bionlp-1.5/
- Cite (ACL):
- Ben Holgate, Joe Davies, Shichao Fang, Joel Winston, James Teo, and Mark Richardson. 2025. Fine-tuning LLMs to Extract Epilepsy Seizure Frequency Data from Health Records. In Proceedings of the 24th Workshop on Biomedical Language Processing, pages 44–55, Vienna, Austria. Association for Computational Linguistics.
- Cite (Informal):
- Fine-tuning LLMs to Extract Epilepsy Seizure Frequency Data from Health Records (Holgate et al., BioNLP 2025)
- PDF:
- https://preview.aclanthology.org/acl25-workshop-ingestion/2025.bionlp-1.5.pdf
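
The abstract describes a single unit of measurement, seizures per month, written as an integer when the rate is one or more and as a decimal when it is below one. The sketch below illustrates how such a normalisation might work. It is not the authors' implementation: the function name `seizures_per_month`, the conversion factors (30 days and 4 weeks per month), and the two-decimal rounding are all assumptions made for illustration.

```python
from fractions import Fraction

# Assumed conversion factors (not from the paper): how many of each
# period fit into one month.
PERIODS_PER_MONTH = {
    "day": Fraction(30),
    "week": Fraction(4),
    "month": Fraction(1),
    "year": Fraction(1, 12),
}


def seizures_per_month(count, every, unit):
    """Normalise 'count seizures every `every` `unit`s' to seizures per month.

    Rates of one or more are returned as an integer; rates below one are
    returned as a decimal rounded to two places (an assumed precision).
    """
    if count < 0 or every <= 0:
        raise ValueError("count must be >= 0 and every must be > 0")
    rate = Fraction(count) * PERIODS_PER_MONTH[unit] / every
    if rate >= 1:
        return round(rate)  # integer for one seizure or more per month
    return round(float(rate), 2)  # decimal for less than one per month


# Hypothetical phrases as they might appear in clinic letters.
print(seizures_per_month(2, 1, "week"))   # "two seizures a week"
print(seizures_per_month(1, 3, "month"))  # "one seizure every three months"
print(seizures_per_month(3, 1, "year"))   # "three seizures a year"
```

Because every letter is reduced to the same unit, values from successive clinic visits can be compared directly, which is what makes it possible to track whether a patient's seizures are improving over time.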