Efficient pretraining of ECG scalogram images using masked autoencoders for cardiovascular disease diagnosis.
Journal:
Scientific Reports
Published Date:
Jul 8, 2025
Abstract
Cardiovascular diseases (CVDs) are the leading cause of mortality worldwide, emphasizing the need for accurate and early diagnosis. Electrocardiograms (ECGs) provide a non-invasive means of diagnosing various cardiac conditions. However, traditional interpretation of ECG signals requires substantial expertise and time, motivating the development of automated deep learning models to enhance diagnostic precision. This study proposes a novel approach that uses masked autoencoders (MAE) to pretrain a model on ECG scalogram images, thereby improving diagnostic accuracy for seven CVDs. Through extensive experimentation, we demonstrated that pretraining with an 85% masking ratio over 500 epochs yields optimal results. The pretrained ViT-S(MAE-scalo) network achieved an AUC of 0.986 and 92.43% accuracy in detecting CVDs from Lead II. Furthermore, an ensemble learning approach applied across the 12 ECG leads improved the model's diagnostic capabilities, resulting in an AUC of 0.994 and 92.72% accuracy. The MAE-based models outperformed traditional models such as ResNet-34 and ViT-S initialized with ImageNet-pretrained or random weights, as well as other self-supervised learning (SSL) models such as MoCo-v2 and BYOL. Notably, the MAE-based models achieved this superior performance with a significantly smaller dataset, only one-twelfth the size of ImageNet. These findings suggest that this efficient pretraining approach holds great potential for clinical application, particularly in resource-limited environments where labeled data are scarce. The method provides a scalable and cost-effective solution for improving CVD diagnosis.
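To make the 85% masking ratio concrete, the following is a minimal sketch of MAE-style random patch masking, not the authors' implementation. The abstract does not state the image or patch size, so the 196-patch grid (a 224 x 224 image split into 16 x 16 patches, as is common for ViT-S) is an assumption, and the function name `random_patch_mask` is hypothetical.

```python
import numpy as np


def random_patch_mask(num_patches, mask_ratio=0.85, seed=None):
    """Pick which patches stay visible to the encoder (hypothetical helper).

    Returns the sorted indices of the visible patches and a boolean mask
    in which True marks a patch that is hidden and must be reconstructed.
    """
    rng = np.random.default_rng(seed)
    # Number of patches the encoder actually sees.
    num_keep = int(round(num_patches * (1.0 - mask_ratio)))
    # A random permutation gives a uniform sample without replacement.
    perm = rng.permutation(num_patches)
    keep_idx = np.sort(perm[:num_keep])
    mask = np.ones(num_patches, dtype=bool)
    mask[keep_idx] = False
    return keep_idx, mask


# Assumed grid: 224 x 224 scalogram image, 16 x 16 patches -> 14 * 14 = 196.
keep_idx, mask = random_patch_mask(196, mask_ratio=0.85, seed=0)
print(len(keep_idx), int(mask.sum()))
```

Under these assumptions the encoder processes only 29 of the 196 patches per image, which is one reason a high masking ratio keeps pretraining comparatively cheap.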