SuPreME: A Supervised Pre-training Framework for Multimodal ECG Representation Learning
Journal: arXiv
Published Date: Feb 27, 2025
Abstract
Cardiovascular diseases are a leading cause of death and disability
worldwide. Electrocardiogram (ECG) recordings are critical for diagnosing and
monitoring cardiac health, but obtaining large-scale annotated ECG datasets is
labor-intensive and time-consuming. Recent ECG Self-Supervised Learning (eSSL)
methods mitigate this by learning features without extensive labels but fail to
capture fine-grained clinical semantics and require extensive task-specific
fine-tuning. To address these challenges, we propose $\textbf{SuPreME}$, a
$\textbf{Su}$pervised $\textbf{Pre}$-training framework for
$\textbf{M}$ultimodal $\textbf{E}$CG representation learning. SuPreME applies
Large Language Models (LLMs) to extract structured clinical entities from
free-text ECG reports, filter out noise and irrelevant content, enhance
clinical representation learning, and build a high-quality, fine-grained
labeled dataset. By using text-based cardiac queries instead of traditional
categorical labels, SuPreME enables zero-shot classification of unseen diseases
without additional fine-tuning. We evaluate SuPreME on six downstream datasets
covering 127 cardiac conditions, where its zero-shot AUC exceeds that of
state-of-the-art eSSL and multimodal methods by more than 1.96\%. Results
demonstrate the effectiveness of SuPreME in leveraging structured, clinically
relevant knowledge for high-quality ECG representations. All code and data will
be released upon acceptance.
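
To make the idea of text-based cardiac queries concrete, the following is a minimal sketch of query-based zero-shot scoring in general, not SuPreME's actual architecture, which the abstract does not specify. It assumes an ECG encoder and a text encoder have already produced fixed-size embeddings in a shared space; the function names (`l2_normalize`, `zero_shot_scores`), the `temperature` value, the embedding size, and the random stand-in embeddings are all hypothetical and chosen only for illustration.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products become cosine similarities."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def zero_shot_scores(ecg_emb, query_emb, temperature=0.1):
    """Score every ECG record against every text query.

    ecg_emb:   (n_records, d) array of ECG embeddings (assumed precomputed).
    query_emb: (n_queries, d) array of text-query embeddings (assumed precomputed).
    Returns an (n_records, n_queries) array of independent scores in (0, 1),
    so one record can match several conditions at once.
    """
    sim = l2_normalize(ecg_emb) @ l2_normalize(query_emb).T
    return 1.0 / (1.0 + np.exp(-sim / temperature))


# Hypothetical queries; a new condition is added by appending a string,
# with no retraining of a fixed classification head.
queries = ["atrial fibrillation", "left bundle branch block", "sinus bradycardia"]

# Stand-in embeddings: random vectors replace real encoder outputs.
rng = np.random.default_rng(0)
query_emb = rng.normal(size=(len(queries), 16))

# A fake ECG embedding placed close to the second query's embedding.
ecg_emb = query_emb[1:2] + 0.05 * rng.normal(size=(1, 16))

scores = zero_shot_scores(ecg_emb, query_emb)
best = queries[int(np.argmax(scores[0]))]
print(best)
```

Because each query is scored independently, the set of conditions is defined by the query strings rather than by a fixed list of categorical labels, which is what allows unseen diseases to be scored without additional fine-tuning.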