Demographic-aware temporal graph attention for fair and accurate cardiac abnormality detection in 12-lead ECG.
Journal:
Scientific Reports
Published Date:
Jun 4, 2026
Abstract
Automated 12-lead ECG interpretation has achieved strong diagnostic accuracy through deep learning, yet systematic demographic disparities, particularly between male and female patients, undermine the equitable deployment of these systems in clinical practice. Existing fairness approaches treat bias as a post-hoc correction, frequently at the cost of diagnostic performance. This paper introduces DA-GAT-v2, a Demographic-Aware Graph Attention Network designed to simultaneously advance diagnostic accuracy and algorithmic fairness in multi-label cardiac abnormality detection. Three clinically motivated architectural innovations are integrated: a lead-wise Temporal Convolutional Encoder (TCE) replacing coarse statistical node features with 128-dimensional morphologically rich embeddings capturing sex- and age-specific PQRST characteristics; a dynamic α-Net that predicts patient-specific inter-lead graph topologies by adaptively balancing anatomical adjacency and signal correlation, reflecting demographic-dependent cardiac geometry; and Feature-wise Linear Modulation (FiLM) integrated into every graph attention layer, enabling independent feature-wise demographic conditioning with substantially greater expressiveness than prior scalar gating approaches. These innovations are optimized through a three-stage curriculum training strategy incorporating a composite fairness regularization loss combining equalized odds and demographic parity constraints. Evaluated on PTB-XL (21,507 recordings), DA-GAT-v2 achieves macro F1 of 0.8952 and AUROC of 0.9762, surpassing all compared baselines. The male-female diagnostic performance gap is reduced from 15.42% to 1.75%, with an equalized odds difference (EO) of 0.0423, well within the clinical acceptance threshold of 0.10. Cross-dataset validation on Chapman-Shaoxing (10,646 recordings) confirms generalization with a minimal F1 degradation of 0.0150.
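The abstract does not give the layer equations, so the following is only a minimal sketch of the two conditioning ideas it names. It assumes the standard FiLM form (a per-feature scale and shift predicted from a conditioning vector) and assumes that "balancing anatomical adjacency and signal correlation" can be read as a convex combination controlled by a scalar α. All names, weights, and the two-element demographic vector are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence

Vector = List[float]
Matrix = List[List[float]]


def linear(weights: Matrix, bias: Vector, z: Sequence[float]) -> Vector:
    """Affine map W z + b, used here to turn demographics into FiLM parameters."""
    return [sum(w * zi for w, zi in zip(row, z)) + b for row, b in zip(weights, bias)]


def film(features: Vector, gamma: Vector, beta: Vector) -> Vector:
    """Standard FiLM: scale and shift every feature channel independently."""
    return [g * x + b for x, g, b in zip(features, gamma, beta)]


def blend_adjacency(anatomical: Matrix, correlation: Matrix, alpha: float) -> Matrix:
    """Assumed convex blend of two lead-to-lead graphs (not confirmed by the paper)."""
    return [
        [alpha * a + (1.0 - alpha) * c for a, c in zip(row_a, row_c)]
        for row_a, row_c in zip(anatomical, correlation)
    ]


# Hypothetical demographic vector: [sex flag, age scaled to 0..1].
z = [1.0, 0.5]

# Hypothetical, untrained weights for a 3-feature node embedding.
w_gamma = [[0.5, 0.0], [0.0, 1.0], [0.0, 0.0]]
b_gamma = [1.0, 1.0, 1.0]
w_beta = [[0.0, 0.0], [0.5, 0.0], [0.0, -1.0]]
b_beta = [0.0, 0.0, 0.0]

gamma = linear(w_gamma, b_gamma, z)
beta = linear(w_beta, b_beta, z)
node_embedding = [2.0, 2.0, 2.0]
modulated = film(node_embedding, gamma, beta)

# Toy 2-lead graphs; a real 12-lead model would use 12 x 12 matrices.
anatomical = [[0.0, 1.0], [1.0, 0.0]]
correlation = [[0.0, 0.5], [0.5, 0.0]]
blended = blend_adjacency(anatomical, correlation, alpha=0.25)
```

In this sketch each feature receives its own scale and shift, whereas a scalar gate would multiply all three features by the same value; that difference is what the abstract refers to as feature-wise conditioning.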
Ablation studies quantify each component's independent contribution, and attention maps reveal clinically coherent demographic-dependent lead prioritization. These results establish DA-GAT-v2 as a technically sound candidate for further clinical evaluation, demonstrating that diagnostic accuracy and demographic fairness in ECG AI are complementary objectives achievable through principled architectural design. Fairness evaluation is currently limited to sex and age subgroups owing to the absence of race and ethnicity metadata in both utilized datasets.
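For readers unfamiliar with the fairness metric, the sketch below shows one common way to compute an equalized odds difference for a single binary label: the larger of the between-group gaps in true positive rate and false positive rate. The abstract does not state the exact definition or the multi-label aggregation the authors used, so this formulation, the function names, and the toy data are assumptions for illustration only.

```python
from typing import Dict, List, Sequence, Tuple


def rates(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (TPR, FPR) for binary labels; an empty denominator yields 0.0."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


def equalized_odds_difference(
    y_true: Sequence[int], y_pred: Sequence[int], group: Sequence[str]
) -> float:
    """Largest between-group gap in TPR or FPR (assumed definition)."""
    by_group: Dict[str, Tuple[List[int], List[int]]] = {}
    for t, p, g in zip(y_true, y_pred, group):
        truths, preds = by_group.setdefault(g, ([], []))
        truths.append(t)
        preds.append(p)
    tprs, fprs = zip(*(rates(t, p) for t, p in by_group.values()))
    return max(max(tprs) - min(tprs), max(fprs) - min(fprs))


# Toy data, not from PTB-XL: four female and four male recordings.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]
group = ["F", "F", "F", "F", "M", "M", "M", "M"]

eo = equalized_odds_difference(y_true, y_pred, group)
within_threshold = eo <= 0.10  # 0.10 is the acceptance threshold cited in the abstract
```

On this toy data the female group has a true positive rate of 0.5 and a false positive rate of 0.0, while the male group has 1.0 and 0.5, so the difference is 0.5 and the threshold check fails. The same function accepts any grouping labels, such as age bands.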