Machine Learning-Driven Glycoproteomic Profiling Identifies Novel Diabetes-Associated Glycosylation Biomarkers
Journal:
medRxiv
Published Date:
Jan 1, 2025
Abstract
Glycosylation plays a critical role in protein function and disease progression, including in diabetes mellitus (DM). This study performed a comprehensive glycoproteomic analysis comparing healthy volunteer (HV) and DM samples, identifying 19,374 peptides and 2,113 proteins, of which 1,104 were glycosylated. A total of 287 distinct glycans were mapped to 3,722 glycosylated peptides, revealing significant differences in glycosylation patterns between HV and DM samples. Statistical analysis identified 29 significantly altered glycosylation sites, of which 23 were upregulated and 6 were downregulated in DM. Notably, the glycan HexNAc(2)Hex(2)Fuc(1) at position 215 of Prosaposin was significantly upregulated in DM, marking its first reported association with diabetes. Machine learning models, particularly Support Vector Machines (SVM) and Generalized Linear Models (GLM), achieved high classification accuracy (approximately 92% to 96%) in distinguishing HV from DM samples based on glycosylation features (glycans, glycosylated proteins, and glycosylation sites). These findings suggest that altered glycosylation patterns may serve as potential biomarkers for diabetes-related pathophysiology and as candidates for therapeutic targeting.
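The glycan reported above is written in compositional notation, where each unit gives a monosaccharide class followed by its count in parentheses (HexNAc for N-acetylhexosamine, Hex for hexose, Fuc for fucose). As an illustration only, the following minimal Python sketch parses such a string into per-residue counts. The function name `parse_glycan_composition` is hypothetical and is not taken from the study's pipeline, and the sketch assumes the string consists solely of consecutive `Name(count)` units.

```python
import re
from typing import Dict

# One "Name(count)" unit, e.g. "HexNAc(2)". Assumes names start with a letter.
_UNIT = re.compile(r"([A-Za-z][A-Za-z0-9]*)\((\d+)\)")


def parse_glycan_composition(composition: str) -> Dict[str, int]:
    """Parse a composition string such as 'HexNAc(2)Hex(2)Fuc(1)'.

    Hypothetical helper for illustration; raises ValueError if any part of
    the string is not a 'Name(count)' unit.
    """
    counts: Dict[str, int] = {}
    position = 0
    for match in _UNIT.finditer(composition):
        # Reject stray characters between consecutive units.
        if match.start() != position:
            raise ValueError(f"unexpected text at offset {position}")
        name, count = match.group(1), int(match.group(2))
        # Sum counts if the same residue class appears more than once.
        counts[name] = counts.get(name, 0) + count
        position = match.end()
    # Reject empty input and trailing text that is not a complete unit.
    if not counts or position != len(composition):
        raise ValueError(f"malformed composition: {composition!r}")
    return counts


if __name__ == "__main__":
    glycan = parse_glycan_composition("HexNAc(2)Hex(2)Fuc(1)")
    print(glycan)                # {'HexNAc': 2, 'Hex': 2, 'Fuc': 1}
    print(sum(glycan.values()))  # 5 monosaccharide residues in total
```

Representing each glycan as a dictionary of residue counts is one simple way to turn glycan identities into numeric features; whether the study encoded its glycosylation features this way is not stated in the abstract.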