DRB1 Subtyping Reveals Divergent Risk and Protection for Type 1 Diabetes in Middle Eastern Populations
Journal: medRxiv
Published Date: Jan 1, 2025
Abstract
Type 1 diabetes (T1D) is strongly influenced by HLA variation, yet current genetic risk models, developed largely in European cohorts, perform suboptimally in Middle Eastern populations because of region-specific allele frequencies, DR4 subtype heterogeneity, and distinct haplotype structures. We aimed to characterize HLA diversity in the Qatar Biobank (QBB) cohort and to develop a Middle East-optimized, machine learning–based T1D risk model (MENA T1D-GRS). We analyzed high-coverage whole-genome sequencing data from >14,000 individuals, including 7,359 healthy controls, 410 clinically diagnosed T1D patients, and 230 first-degree relatives (FDRs). High-resolution HLA typing was performed using HLA-LA, HLA-HD, and Kourami. Haplotype phasing, linkage disequilibrium (LD) estimation, and association testing identified population-specific risk and protective configurations. We computed GRS2 using 66 genomic variants and trained an XGBoost classifier integrating 79 weighted HLA features and GRS2 components. Synthetic data augmentation (ADASYN) was applied to correct the class imbalance between T1D cases and controls and thereby improve model sensitivity. Model discrimination was evaluated by the area under the ROC curve (AUC). The QBB cohort exhibited exceptional HLA diversity, with 305 DRB1-DQA1-DQB1 haplotypes. The established risk haplotypes DR3-DQ2.5 and DR4-DQ8.1 were significantly enriched in T1D cases, with compound heterozygosity conferring >12-fold increased odds. Importantly, DRB1*04:03 was protective (OR=0.54), contrasting sharply with DRB1*04:02 and DRB1*04:05. GRS2 achieved an AUC of 0.74 vs. population controls and 0.65 vs. FDRs; the AUC improved to 0.81 in autoantibody-positive cases. The MENA T1D-GRS model achieved an AUC of 0.79 at baseline and 0.82 with ADASYN. Sensitivity improved to 75–80% in autoantibody-positive subgroups. SHAP analysis revealed allele-specific effects, highlighting the opposing roles of DR4 subtypes.
The MENA T1D-GRS provides a population-tailored genomic risk prediction tool, outperforming existing scores and capturing non-linear HLA interactions. It supports early screening, differential diagnosis, and precision medicine efforts in Middle Eastern populations.
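For readers less familiar with the two summary statistics quoted above, the following is a minimal Python sketch of how an allele-level odds ratio (OR) and an AUC are conventionally computed. It is not the authors' pipeline: the function names `odds_ratio` and `auc_roc` are hypothetical, the Wald confidence interval is an assumed choice, and any counts or scores passed in would be illustrative rather than data from the QBB cohort.

```python
import math


def odds_ratio(case_carriers, case_noncarriers, ctrl_carriers, ctrl_noncarriers):
    """Odds ratio with a Wald 95% CI from a 2x2 carrier table.

    Hypothetical helper: all four counts must be positive.
    An OR below 1 indicates the allele is less common among cases
    (protective); an OR above 1 indicates enrichment in cases.
    """
    or_value = (case_carriers * ctrl_noncarriers) / (case_noncarriers * ctrl_carriers)
    # Standard error of log(OR) under the usual Wald approximation.
    se = math.sqrt(
        1 / case_carriers
        + 1 / case_noncarriers
        + 1 / ctrl_carriers
        + 1 / ctrl_noncarriers
    )
    lower = math.exp(math.log(or_value) - 1.96 * se)
    upper = math.exp(math.log(or_value) + 1.96 * se)
    return or_value, lower, upper


def auc_roc(labels, scores):
    """AUC as the probability that a random case outscores a random control.

    labels: 1 for case, 0 for control; scores: risk scores (higher = riskier).
    Ties between a case and a control count as half a win.
    """
    case_scores = [s for y, s in zip(labels, scores) if y == 1]
    ctrl_scores = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in case_scores
        for k in ctrl_scores
    )
    return wins / (len(case_scores) * len(ctrl_scores))
```

Under this definition, an AUC of 0.5 corresponds to chance-level ranking of cases against controls and 1.0 to perfect separation, which is the scale on which the reported values of 0.74 to 0.82 should be read.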