FedCliMask: Context-Aware Federated Learning with Ontology-Guided Semantic Masking for Clinical NLP
Srijit Paul, Sajeeb Das, Ucchas Muhury, Akib Jayed Islam, Dhruba Jyoti Barua, Sultanus Salehin, Prasun Datta
Abstract
Clinical federated learning faces critical challenges from statistical heterogeneity across healthcare institutions and privacy requirements for sensitive medical data. This work implements the foundational components of FedCliMask and proposes a comprehensive framework for privacy-preserving federated learning in clinical settings that combines ontology-guided semantic masking with context-aware federated aggregation. Our framework addresses the dual challenges of privacy preservation and statistical heterogeneity through two key innovations: (1) ontology-guided semantic masking, which uses UMLS hierarchies to provide graduated privacy protection while preserving clinical semantics, and (2) context-aware federated aggregation, which considers hospital-specific features including medical specialties, data complexity, privacy levels, and data volume. The semantic masking component is implemented and evaluated on synthetic clinical data, demonstrating effective privacy-utility tradeoffs across four masking levels. The context-aware analysis component is also implemented; it profiles 12,996 synthetic clinical notes across 6 diverse hospitals and demonstrates meaningful differentiation between hospitals. The complete framework is designed to enable privacy-preserving clinical trial recruitment through federated learning while adapting to institutional heterogeneity.
- Anthology ID:
- 2025.globalnlp-1.19
- Volume:
- Proceedings of the Workshop on Beyond English: Natural Language Processing for all Languages in an Era of Large Language Models
- Month:
- September
- Year:
- 2025
- Address:
- Varna, Bulgaria
- Editors:
- Sudhansu Bala Das, Pruthwik Mishra, Alok Singh, Shamsuddeen Hassan Muhammad, Asif Ekbal, Uday Kumar Das
- Venues:
- GlobalNLP | WS
- Publisher:
- INCOMA Ltd., Shoumen, BULGARIA
- Pages:
- 172–180
- URL:
- https://preview.aclanthology.org/corrections-2026-01/2025.globalnlp-1.19/
- Cite (ACL):
- Srijit Paul, Sajeeb Das, Ucchas Muhury, Akib Jayed Islam, Dhruba Jyoti Barua, Sultanus Salehin, and Prasun Datta. 2025. FedCliMask: Context-Aware Federated Learning with Ontology-Guided Semantic Masking for Clinical NLP. In Proceedings of the Workshop on Beyond English: Natural Language Processing for all Languages in an Era of Large Language Models, pages 172–180, Varna, Bulgaria. INCOMA Ltd., Shoumen, BULGARIA.
- Cite (Informal):
- FedCliMask: Context-Aware Federated Learning with Ontology-Guided Semantic Masking for Clinical NLP (Paul et al., GlobalNLP 2025)
- PDF:
- https://preview.aclanthology.org/corrections-2026-01/2025.globalnlp-1.19.pdf