Identification of Novel Genes in Human Airway Epithelial Cells associated with Chronic Obstructive Pulmonary Disease (COPD) using Machine-Based Learning Algorithms.

Journal: Scientific reports
PMID:

Abstract

The aim of this project was to identify candidate novel therapeutic targets for the treatment of COPD using machine-based learning (ML) algorithms and penalized regression models. The study included 59 healthy smokers, 53 healthy non-smokers and 21 COPD smokers (9 GOLD stage I and 12 GOLD stage II) (n = 133). A total of 20,097 probes were generated from a small airway epithelium (SAE) microarray dataset previously obtained from these subjects. The association of gene expression levels with smoking and with COPD, respectively, was then assessed using AdaBoost Classification Trees, Decision Tree, Gradient Boosting Machines, Naive Bayes, Neural Network, Random Forest, Support Vector Machine, and adaptive LASSO, Elastic-Net and Ridge logistic regression analyses. Using this methodology, we identified 44 candidate genes, 27 of which had previously been reported as important factors in the pathogenesis of COPD or the regulation of lung function. We also identified 17 genes that have not previously been reported to be associated with the pathogenesis of COPD or the regulation of lung function. The most significantly regulated of these genes included PRKAR2B, GAD1, LINC00930 and SLITRK6. These novel genes may provide a basis for the future development of novel therapeutics for COPD and its associated morbidities.
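
As a rough illustration of how a penalized logistic regression can rank probes by their association with disease status, the sketch below fits a ridge (L2-penalized) logistic model to synthetic data and orders the features by absolute coefficient. It is not the authors' pipeline: the data are simulated, the group sizes and probe count are arbitrary, and the names fit_ridge_logistic, rank_features and make_toy_data are hypothetical helpers introduced only for this example.

```python
import math
import random


def fit_ridge_logistic(X, y, lam=0.1, lr=0.1, epochs=500):
    """Fit an L2-penalized logistic regression by batch gradient descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * p
        grad_b = 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            for j in range(p):
                grad_w[j] += err * xi[j]
            grad_b += err
        # Mean log-loss gradient plus the ridge term; the intercept is not penalized.
        w = [wj - lr * (gj / n + lam * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def rank_features(w):
    """Return feature indices ordered by decreasing absolute coefficient."""
    return sorted(range(len(w)), key=lambda j: abs(w[j]), reverse=True)


def make_toy_data(n_controls=40, n_cases=20, n_probes=6, seed=0):
    """Simulate expression values; only probe 0 differs between the groups (assumption)."""
    rng = random.Random(seed)
    X, y = [], []
    for label, count in ((0, n_controls), (1, n_cases)):
        for _ in range(count):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_probes)]
            row[0] += 2.0 * label  # shift the informative probe in cases
            X.append(row)
            y.append(label)
    return X, y


X, y = make_toy_data()
w, b = fit_ridge_logistic(X, y)
ranking = rank_features(w)
print("probes ranked by |coefficient|:", ranking)
```

In this toy setting the shifted probe should come out at the top of the ranking. An elastic-net or adaptive LASSO fit would replace the ridge term with a different penalty, and in practice the expression values would usually be standardized and the penalty strength chosen by cross-validation.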

Authors

  • Shayan Mostafaei
    Department of Biostatistics, Faculty of Medical Sciences, Tarbiat Modares University, Tehran, Iran.
  • Anoshirvan Kazemnejad
    Department of Biostatistics, Faculty of Medical Sciences, Tarbiat Modares University, Tehran, Iran. kazem_an@modares.ac.ir.
  • Sadegh Azimzadeh Jamalkandi
    Chemical Injuries Research Center, Systems Biology and Poisonings Institute, Baqiyatallah University of Medical Sciences, Tehran, Iran.
  • Soroush Amirhashchi
    Department of Actuarial Science, Faculty of Mathematical Science, Shahid Beheshti University, Tehran, Iran.
  • Seamas C Donnelly
    Department of Clinical Medicine, School of Medicine, Trinity Biomedical Sciences Institute, Trinity College Dublin, Dublin 2, Ireland.
  • Michelle E Armstrong
    Department of Clinical Medicine, School of Medicine, Trinity Biomedical Sciences Institute, Trinity College Dublin, Dublin 2, Ireland.
  • Mohammad Doroudian
    Department of Clinical Medicine, School of Medicine, Trinity Biomedical Sciences Institute, Trinity College Dublin, Dublin 2, Ireland.