Ziqing Ji
2026
CENT: Context Engineering Framework for Normalization of Clinical Trial Procedures
Sanya Taneja | Ziqing Ji | Hans Verstraete | Ali Samadani
BioNLP 2026
Clinical concept normalization is essential for clinical research applications that involve trial protocols, such as patient-trial matching. Existing approaches focus heavily on specific domains and require large annotated datasets. To address these challenges, we propose CENT, a context engineering framework that combines semantic matching for candidate retrieval with Large Language Model (LLM) prompting for disambiguation. We applied CENT to a publicly available dataset of procedures normalized to Current Procedural Terminology (CPT) concepts and evaluated the framework using both binary metrics and hierarchical metrics, the latter accounting for the hierarchical structure of the predicted codes. On clinical procedure normalization, CENT outperforms both semantic matching alone and LLM-only approaches on binary and hierarchical metrics, without requiring fine-tuning. Advanced prompting strategies, including Chain-of-Thought and Tree-of-Thoughts, achieve high performance at practical cost. We further applied CENT to predict codes in two datasets derived from clinical protocols to validate its performance on noisy procedure texts. CENT is scalable and adaptable for normalization across diverse clinical vocabularies in real-world clinical applications.
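The retrieve-then-disambiguate pattern described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It rests on several assumptions: a bag-of-words cosine similarity stands in for the semantic matcher, the codes `X001` to `X004` are hypothetical placeholders rather than real CPT codes, the prompt wording is invented, and `llm` is any user-supplied callable that maps a prompt string to a reply string.

```python
import math
import re
from collections import Counter
from typing import Callable, Dict, List

# Hypothetical placeholder vocabulary; these are NOT real CPT codes.
VOCAB: Dict[str, str] = {
    "X001": "electrocardiogram with interpretation and report",
    "X002": "computed tomography of the chest without contrast",
    "X003": "magnetic resonance imaging of the brain",
    "X004": "collection of venous blood by venipuncture",
}


def _vec(text: str) -> Counter:
    """Lowercased bag-of-words vector (a stand-in for a real embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(mention: str, vocab: Dict[str, str], k: int = 3) -> List[str]:
    """Return the k codes whose descriptions are most similar to the mention."""
    query = _vec(mention)
    scored = sorted(
        ((_cosine(query, _vec(desc)), code) for code, desc in vocab.items()),
        key=lambda pair: (-pair[0], pair[1]),  # best score first, ties by code
    )
    return [code for _, code in scored[:k]]


def build_prompt(mention: str, candidates: List[str], vocab: Dict[str, str]) -> str:
    """Place the retrieved candidates in the LLM context for disambiguation."""
    lines = [f"Procedure mention: {mention}", "Candidate concepts:"]
    lines += [f"- {code}: {vocab[code]}" for code in candidates]
    lines.append("Answer with the single best code.")
    return "\n".join(lines)


def normalize(
    mention: str, vocab: Dict[str, str], llm: Callable[[str], str], k: int = 3
) -> str:
    """Retrieve candidates, ask the LLM to choose, fall back to the top hit."""
    candidates = retrieve(mention, vocab, k)
    answer = llm(build_prompt(mention, candidates, vocab)).strip()
    return answer if answer in candidates else candidates[0]
```

Restricting the accepted answer to the retrieved candidates is one possible design choice: it keeps the model from returning a code outside the vocabulary, at the cost of capping accuracy at the recall of the retrieval step.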