Improving Barrett’s Oesophagus Surveillance Scheduling with Large Language Models: A Structured Extraction Approach

Xinyue Zhang; Agathe Zecevic; Sebastian Zeki; Angus Roberts

Abstract

Gastroenterology (GI) cancer surveillance scheduling relies on extracting structured data from unstructured clinical texts, such as endoscopy and pathology reports. Traditional Natural Language Processing (NLP) models have been employed for this task, but recent advances in Large Language Models (LLMs) present a new opportunity for automation without requiring extensive labelled datasets. In this study, we propose an LLM-based entity extraction and rule-based decision support framework for predicting Barrett’s Oesophagus (BO) surveillance timing. Our approach processes endoscopy and pathology reports to extract clinically relevant information and structures it into a standardised format, which is then used to determine appropriate surveillance intervals. We evaluate multiple state-of-the-art LLMs on real-world clinical datasets from two hospitals, assessing their accuracy and running time. The results demonstrate that LLMs, particularly Phi-4 and (DeepSeek-distilled) Qwen-2.5, can automate the extraction of BO surveillance-related information with high accuracy, and that Phi-4 is also efficient during inference. We also compare the trade-offs between LLMs and fine-tuned non-LLM models. Our findings indicate that LLM-extraction-based methods can support clinical decision-making by providing justifications drawn from the report extractions, reducing manual workload, and improving guideline adherence in BO surveillance scheduling.

Anthology ID: 2025.bionlp-1.16
Volume: ACL 2025
Month: August
Year: 2025
Address: Vienna, Austria
Editors: Dina Demner-Fushman, Sophia Ananiadou, Makoto Miwa, Junichi Tsujii
Venues: BioNLP | WS
Publisher: Association for Computational Linguistics
Pages: 176–189
URL: https://preview.aclanthology.org/acl25-workshop-ingestion/2025.bionlp-1.16/
Cite (ACL): Xinyue Zhang, Agathe Zecevic, Sebastian Zeki, and Angus Roberts. 2025. Improving Barrett’s Oesophagus Surveillance Scheduling with Large Language Models: A Structured Extraction Approach. In ACL 2025, pages 176–189, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): Improving Barrett’s Oesophagus Surveillance Scheduling with Large Language Models: A Structured Extraction Approach (Zhang et al., BioNLP 2025)
PDF: https://preview.aclanthology.org/acl25-workshop-ingestion/2025.bionlp-1.16.pdf
Supplementary material: 2025.bionlp-1.16.SupplementaryMaterial.zip
Supplementary material: 2025.bionlp-1.16.SupplementaryMaterial.txt
