Enhanced ECG Arrhythmia Detection Accuracy by Optimizing Divergence-Based Data Fusion
Journal:
arXiv
Published Date:
Mar 19, 2025
Abstract
AI computation in healthcare faces significant challenges when clinical
datasets are limited and heterogeneous. Datasets from multiple sources and
different equipment must therefore often be joined for analysis, but this
integration is complicated by their diversity, complexity, and lack of
representativeness. The method currently in use is fusion after normalization,
which can introduce redundant information, decreasing the signal-to-noise
ratio and reducing classification accuracy. To tackle this issue, we propose a
feature-based fusion algorithm utilizing Kernel Density Estimation (KDE) and
Kullback-Leibler (KL) divergence. Our approach first performs preprocessing
and continuous density estimation on the extracted features, then employs
gradient descent to identify the optimal linear parameters that minimize the
KL divergence between the feature distributions. We evaluate the method on our
in-house datasets, which consist of ECG signals collected from 2000 healthy
and 2000 diseased individuals with different equipment, and verify it on the
publicly available PTB-XL dataset, which contains 21,837 ECG recordings from
18,885 patients. We employ a Light Gradient Boosting Machine (LGBM) model for
binary classification. The results demonstrate that the proposed fusion
method significantly enhances feature-based classification accuracy for
abnormal ECG cases in the merged datasets, compared to the normalization
method. This data fusion strategy provides a new approach to processing
heterogeneous datasets for optimal AI computation results.
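
The fusion step described above can be illustrated with a minimal sketch. It is
not the authors' implementation: it assumes a single one-dimensional feature,
synthetic Gaussian samples standing in for features from two devices, Gaussian
KDE from SciPy, and finite-difference gradients in place of whatever gradient
computation the paper uses. The names make_kl_loss and fit_linear_alignment,
and the parameters lr, steps, and h, are hypothetical.

```python
import numpy as np
from scipy.stats import gaussian_kde

EPS = 1e-12  # keeps the logarithm finite where a density is close to zero


def make_kl_loss(ref, src, n_grid=256):
    """Return loss(params) approximating KL(p_ref || p_{a*src+b}) with Gaussian KDEs."""
    pad = 3.0 * ref.std()
    grid = np.linspace(ref.min() - pad, ref.max() + pad, n_grid)
    dx = grid[1] - grid[0]
    p = gaussian_kde(ref)(grid) + EPS  # reference density, estimated once

    def loss(params):
        a, b = params
        q = gaussian_kde(a * src + b)(grid) + EPS  # density of the transformed source
        return float(np.sum(p * np.log(p / q)) * dx)  # Riemann-sum KL estimate

    return loss


def fit_linear_alignment(ref, src, lr=0.1, steps=150, h=1e-3):
    """Gradient descent on the linear parameters (a, b) of the map x -> a * x + b."""
    loss = make_kl_loss(ref, src)
    params = np.array([1.0, 0.0])  # start from the identity map
    for _ in range(steps):
        grad = np.zeros_like(params)
        for i in range(params.size):
            delta = np.zeros_like(params)
            delta[i] = h
            # Central finite difference (an assumption made for simplicity)
            grad[i] = (loss(params + delta) - loss(params - delta)) / (2.0 * h)
        params = params - lr * grad
    return params, loss


# Synthetic stand-ins for one feature measured by two devices (not ECG data).
rng = np.random.default_rng(0)
ref = rng.normal(0.0, 1.0, 400)               # feature from the reference device
src = 1.5 * rng.normal(0.0, 1.0, 400) + 1.0   # same feature, shifted and rescaled

(a, b), loss = fit_linear_alignment(ref, src)
kl_before = loss(np.array([1.0, 0.0]))
kl_after = loss(np.array([a, b]))
aligned = a * src + b  # source feature mapped onto the reference distribution
print(f"a={a:.3f}, b={b:.3f}, KL before={kl_before:.3f}, KL after={kl_after:.3f}")
```

In this sketch the aligned feature would then be concatenated with the
reference feature before training a classifier. A multi-feature version would
repeat the fit per feature or optimize a joint linear map, a choice the
abstract does not specify.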