SuPreME: A Supervised Pre-training Framework for Multimodal ECG Representation Learning

Mingsheng Cai, Jiuming Jiang, Wenhao Huang, Che Liu, Rossella Arcucci


Abstract
Cardiovascular diseases are a leading cause of death and disability worldwide. The electrocardiogram (ECG) is critical for diagnosing and monitoring cardiac health, but obtaining large-scale annotated ECG datasets is labor-intensive and time-consuming. Recent ECG Self-Supervised Learning (eSSL) methods mitigate this by learning features without extensive labels, but they fail to capture fine-grained clinical semantics and require extensive task-specific fine-tuning. To address these challenges, we propose SuPreME, a Supervised Pre-training framework for Multimodal ECG representation learning. SuPreME is pre-trained on structured diagnostic labels derived from ECG report entities through a one-time offline extraction with Large Language Models (LLMs), which denoises and standardizes cardiac concepts and improves clinical representation learning. By fusing ECG signals with textual cardiac queries instead of fixed labels, SuPreME enables zero-shot classification of unseen conditions without further fine-tuning. We evaluate SuPreME on six downstream datasets covering 106 cardiac conditions, achieving a superior zero-shot AUC of 77.20%, surpassing state-of-the-art eSSL methods by 4.98%. Results demonstrate SuPreME’s effectiveness in leveraging structured, clinically relevant knowledge for high-quality ECG representations.
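The zero-shot setup described above can be illustrated with a minimal sketch: an ECG embedding is scored against text embeddings of cardiac-condition queries by cosine similarity, and conditions are ranked by score. The encoders, embedding dimension, and query names below are stand-ins for illustration only, not the actual SuPreME architecture.

```python
import numpy as np

def zero_shot_classify(ecg_embedding, query_embeddings, query_names):
    """Rank cardiac-condition queries by cosine similarity to one ECG embedding.

    ecg_embedding:    (d,) vector from a (hypothetical) ECG encoder.
    query_embeddings: (k, d) matrix from a (hypothetical) text encoder.
    query_names:      k human-readable condition names.
    """
    e = ecg_embedding / np.linalg.norm(ecg_embedding)
    q = query_embeddings / np.linalg.norm(query_embeddings, axis=1, keepdims=True)
    scores = q @ e  # cosine similarity of each query against the ECG
    order = np.argsort(-scores)  # highest similarity first
    return [(query_names[i], float(scores[i])) for i in order]

# Toy example: random vectors stand in for real encoder outputs.
rng = np.random.default_rng(0)
ecg = rng.normal(size=128)
queries = rng.normal(size=(3, 128))
names = ["atrial fibrillation", "sinus rhythm", "left bundle branch block"]
ranked = zero_shot_classify(ecg, queries, names)
```

Because conditions are expressed as free-text queries rather than a fixed label head, new conditions can be scored at inference time simply by embedding their query text.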
Anthology ID:
2025.findings-emnlp.633
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2025
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
11817–11844
URL:
https://preview.aclanthology.org/name-variant-enfa-fane/2025.findings-emnlp.633/
DOI:
10.18653/v1/2025.findings-emnlp.633
Cite (ACL):
Mingsheng Cai, Jiuming Jiang, Wenhao Huang, Che Liu, and Rossella Arcucci. 2025. SuPreME: A Supervised Pre-training Framework for Multimodal ECG Representation Learning. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 11817–11844, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
SuPreME: A Supervised Pre-training Framework for Multimodal ECG Representation Learning (Cai et al., Findings 2025)
PDF:
https://preview.aclanthology.org/name-variant-enfa-fane/2025.findings-emnlp.633.pdf
Checklist:
2025.findings-emnlp.633.checklist.pdf