Machine Learning of Plasma Proteomics Classifies Diagnosis of Interstitial Lung Disease.

Journal: American journal of respiratory and critical care medicine
Published Date:

Abstract

Distinguishing connective tissue disease-associated interstitial lung disease (CTD-ILD) from idiopathic pulmonary fibrosis (IPF) can be clinically challenging. This study aimed to identify proteins that separate and classify patients with CTD-ILD and those with IPF.

Four registries with 1,247 patients with IPF and 352 patients with CTD-ILD were included in analyses. Plasma samples were subjected to high-throughput proteomics assays. Protein features were prioritized using recursive feature elimination to construct a proteomic classifier. Multiple machine learning models, including support vector machine, LASSO (least absolute shrinkage and selection operator) regression, random forest, and imbalanced random forest, were trained and tested in independent cohorts. The validated models were used to classify each case iteratively in external datasets.

A classifier with 37 proteins (proteomic classifier 37 [PC37]) was enriched in the biological processes of bronchiole development, smooth muscle proliferation, and immune responses. Four machine learning models used PC37 with sex and age score to generate continuous classification values. Receiver operating characteristic curve analyses of these scores demonstrated consistent areas under the curve of 0.85-0.90 in the test cohort and 0.94-0.96 in the single-sample dataset. Binary classification demonstrated 78.6-80.4% sensitivity and 76-84.4% specificity in the test cohort and 93.5-96.1% sensitivity and 69.5-77.6% specificity in the single-sample classification dataset. Composite analysis of all machine learning models confirmed 78.2% (194 of 248) accuracy in the test cohort and 82.9% (208 of 251) accuracy in the single-sample classification dataset.

Multiple machine learning models trained with large cohort proteomic datasets consistently distinguished CTD-ILD from IPF. Many of the identified proteins are involved in immune pathways. We further developed a novel approach for single-sample classification, which could facilitate honing the differential diagnosis of ILD in challenging cases and improve clinical decision making.
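
The evaluation metrics quoted in the abstract (area under the curve, sensitivity, specificity, and composite accuracy) can be illustrated with a short Python sketch. This is not the authors' pipeline. It assumes CTD-ILD is treated as the positive class and that the composite call is a simple majority vote across the four models; the abstract specifies neither. All function names and the toy labels and scores are hypothetical. Only the counts 194 of 248 and 208 of 251 come from the abstract.

```python
from collections import Counter


def binary_metrics(y_true, y_pred, positive="CTD-ILD"):
    """Sensitivity, specificity, and accuracy for a two-class problem.

    Assumption: CTD-ILD is the positive class (not stated in the abstract).
    Both classes must be present in y_true, otherwise a division by zero occurs.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
    }


def roc_auc(scores_pos, scores_neg):
    """AUC as the probability that a positive case outscores a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    wins = sum((p > n) + 0.5 * (p == n) for p in scores_pos for n in scores_neg)
    return wins / (len(scores_pos) * len(scores_neg))


def composite_call(model_calls, tie_label="indeterminate"):
    """Majority vote over per-model labels (an assumed combination rule)."""
    counts = Counter(model_calls).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return tie_label  # no strict majority between the top two labels
    return counts[0][0]


def accuracy_percent(correct, total):
    """Accuracy as a percentage rounded to one decimal place."""
    return round(100 * correct / total, 1)


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    truth = ["CTD-ILD", "CTD-ILD", "IPF", "IPF", "IPF"]
    calls = ["CTD-ILD", "IPF", "IPF", "IPF", "CTD-ILD"]
    print(binary_metrics(truth, calls))
    print(roc_auc([0.9, 0.8, 0.4], [0.7, 0.3]))
    print(composite_call(["IPF", "IPF", "CTD-ILD", "IPF"]))

    # Counts reported in the abstract for the composite analysis.
    print(accuracy_percent(194, 248))
    print(accuracy_percent(208, 251))
```

Applied to the reported counts, `accuracy_percent` reproduces the 78.2% and 82.9% composite accuracies. A tie among four models is returned as "indeterminate" here. That is a design choice of this sketch, not something described in the abstract.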

Authors

  • Yong Huang
    State Key Laboratory for the Chemistry and Molecular Engineering of Medicinal Resources, Key Laboratory of Ecology of Rare and Endangered Species and Environmental Protection of Ministry Education, Guangxi Normal University, Guilin 541004, China.
  • Shwu-Fan Ma
    Division of Pulmonary and Critical Care Medicine, University of Virginia, Charlottesville, Virginia.
  • Justin M Oldham
    Division of Pulmonary and Critical Care Medicine, University of Michigan, Ann Arbor, MI; Department of Epidemiology, University of Michigan, Ann Arbor, MI. Electronic address: oldhamj@med.umich.edu.
  • Ayodeji Adegunsoye
    Division of Pulmonary and Critical Care Medicine, University of Chicago, Chicago, IL.
  • Daisy Zhu
    Division of Pulmonary and Critical Care Medicine, University of Virginia, Charlottesville, Virginia.
  • Susan Murray
    Division of Pulmonary and Critical Care Medicine, Department of Internal Medicine, University of Michigan, Ann Arbor, Michigan.
  • John S Kim
    Division of Pulmonary and Critical Care Medicine, University of Virginia, Charlottesville, Virginia.
  • Catherine Bonham
    Division of Pulmonary and Critical Care Medicine, University of Virginia, Charlottesville, Virginia.
  • Emma Strickland
    Division of Pulmonary and Critical Care Medicine, University of Virginia, Charlottesville, Virginia.
  • Angela L Linderholm
    Division of Pulmonary, Critical Care and Sleep Medicine, University of California, Davis, Davis, California.
  • Cathryn T Lee
    Section of Pulmonary and Critical Care, Department of Medicine.
  • Tessy Paul
    Division of Pulmonary and Critical Care Medicine, University of Virginia, Charlottesville, Virginia.
  • Hannah Mannem
    Division of Pulmonary and Critical Care Medicine, University of Virginia, Charlottesville, Virginia.
  • Toby M Maher
    Division of Pulmonary, Critical Care and Sleep Medicine, Keck School of Medicine, University of Southern California, Los Angeles, California, USA.
  • Philip L Molyneaux
    National Heart and Lung Institute, Imperial College London, London, United Kingdom.
  • Mary E Strek
    Section of Pulmonary and Critical Care, Department of Medicine.
  • Fernando J Martinez
    Division of Pulmonary and Critical Care Medicine, Weill Cornell Medical College, New York, New York.
  • Imre Noth
    Division of Pulmonary and Critical Care Medicine, University of Virginia, Charlottesville, Virginia.