Abstract
Healthcare tasks such as predicting clinical outcomes across medical and surgical populations, disease prediction, predicting patient health journeys, are typically approached with supervised learning on task-specific datasets. We demonstrate that language models begin to learn these tasks without any explicit supervision when trained on a new dataset of billions of administrative claims, which essentially encapsulates the practice of medicine, offering a unique perspective on patient care and treatment patterns. Our model, MediClaimGPT, a 125M parameter Transformer demonstrates strong zero-shot predictive capabilities, accurately forecasting patient health events across four evaluation datasets, with its capabilities further demonstrated in various downstream tasks. A significant application of MediClaimGPT is in generating high-quality, clinically plausible synthetic claims data, enhancing healthcare data utility while preserving patient privacy. This research underscores the potential of language models in handling complex datasets and their strategic application in healthcare and related fields.- Anthology ID:
- 2024.naacl-industry.37
- Volume:
- Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 6: Industry Track)
- Month:
- June
- Year:
- 2024
- Address:
- Mexico City, Mexico
- Editors:
- Yi Yang, Aida Davani, Avi Sil, Anoop Kumar
- Venue:
- NAACL
- SIG:
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 427–436
- Language:
- URL:
- https://aclanthology.org/2024.naacl-industry.37
- DOI:
- 10.18653/v1/2024.naacl-industry.37
- Cite (ACL):
- Teja Kanchinadam and Gauher Shaheen. 2024. Large Language Models Encode the Practice of Medicine. In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 6: Industry Track), pages 427–436, Mexico City, Mexico. Association for Computational Linguistics.
- Cite (Informal):
- Large Language Models Encode the Practice of Medicine (Kanchinadam & Shaheen, NAACL 2024)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-4/2024.naacl-industry.37.pdf