Integrated machine learning pipeline for aberrant biomarker enrichment (i-mAB): characterizing clusters of differentiation within a compendium of systemic lupus erythematosus patients.

Journal: AMIA ... Annual Symposium proceedings. AMIA Symposium
PMID:

Abstract

Clusters of differentiation (CD) are cell surface biomarkers that denote key biological differences between cell types and disease states. CD-targeting therapeutic monoclonal antibodies (mAbs) afford rich trans-disease repositioning opportunities. Within a compendium of systemic lupus erythematosus (SLE) patients, we applied the Integrated machine learning pipeline for aberrant biomarker enrichment (i-mAB) to profile gene expression features affecting CD20, CD22 and CD30 gene aberrance. First, a novel Relief-based algorithm identified interdependent features (p=681) predicting treatment-naïve SLE patients (balanced accuracy=0.822). We then compiled CD-associated expression profiles using regularized logistic regression and pathway enrichment analyses. In an independent general cell line model system dataset, we replicated associations () of (p=1.69e-9) and (p=4.63e-8) with CD22; (p=7.00e-4), (p=1.71e-2), and (p=3.34e-2) with CD30; and , a phosphatase linked to bone mineralization, with both CD22 (p=4.37e-2) and CD30 (p=7.40e-3). Utilizing carefully aggregated secondary data and leveraging hypotheses, i-mAB fostered robust biomarker profiling among interdependent biological features.
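
The abstract reports classifier performance as balanced accuracy but does not say how it was computed. Under the usual binary definition (the mean of sensitivity and specificity), the metric can be sketched as below. The function name and the 1 = case, 0 = control label coding are assumptions made for illustration, not details taken from the paper.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels.

    Assumed coding (illustrative only): 1 = case, 0 = control.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # cases called cases
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # cases missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # controls called controls
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # controls called cases

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")

    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Toy labels, not data from the study: 4 cases and 2 controls.
truth = [1, 1, 1, 1, 0, 0]
calls = [1, 1, 1, 0, 0, 1]
score = balanced_accuracy(truth, calls)  # sensitivity 0.75, specificity 0.50
```

Unlike plain accuracy, this average weights both classes equally, which matters when cases and controls are unevenly represented in a compendium.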

Authors

  • Trang T Le
    Department of Biostatistics, Epidemiology, and Informatics.
  • Nigel O Blackwood
    Department of Biostatistics, Epidemiology, and Informatics.
  • Jaclyn N Taroni
    Department of Systems Pharmacology and Translational Therapeutics; Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA.
  • Weixuan Fu
    Department of Biostatistics, Epidemiology, and Informatics.
  • Matthew K Breitenstein
    Department of Biostatistics, Epidemiology, and Informatics.