Machine Learning Reduced Gene/Non-Coding RNA Features That Classify Schizophrenia Patients Accurately and Highlight Insightful Gene Clusters.

Journal: International Journal of Molecular Sciences
PMID:

Abstract

RNA-seq has been a powerful method for detecting differentially expressed genes and long non-coding RNAs (lncRNAs) in schizophrenia (SCZ) patients; however, because of overfitting, differentially expressed targets (DETs) cannot be used reliably as biomarkers. This study used machine learning to reduce the gene/non-coding RNA feature set. Dorsolateral prefrontal cortex (DLPFC) RNA-seq data from 254 individuals were obtained from the CommonMind Consortium. The average predictive accuracy for SCZ patients was 67% based on coding genes and 96% based on lncRNAs. Machine learning is a powerful approach for reducing candidate features to functional biomarkers in SCZ patients. The lncRNAs capture the characteristics of SCZ tissue more accurately than mRNAs, as the former regulate gene expression at every level, not only at the mRNA level.
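
The abstract does not state which algorithm was used for feature reduction or classification. As a minimal illustration only, the following Python sketch shows one common way to reduce features without inflating accuracy through overfitting: features are ranked on the training fold alone, and accuracy is averaged over held-out folds. Everything here is an assumption made for illustration and is not the authors' method: the synthetic data, the mean-difference ranking, the nearest-centroid classifier, and all function names and parameter values are hypothetical.

```python
import random
from statistics import mean


def make_data(n_samples=60, n_features=50, n_informative=5, shift=3.0, seed=0):
    """Synthetic 'expression' matrix (hypothetical data, not CommonMind).

    The first n_informative features are shifted upward in cases (label 1).
    """
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2  # alternate control (0) and case (1)
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        if label == 1:
            for j in range(n_informative):
                row[j] += shift
        X.append(row)
        y.append(label)
    return X, y


def class_means(X, y, features):
    """Per-class mean of each listed feature."""
    means = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        means[label] = [mean(r[j] for r in rows) for j in features]
    return means


def select_features(X, y, k):
    """Keep the k features with the largest absolute difference in class means."""
    all_features = list(range(len(X[0])))
    m = class_means(X, y, all_features)
    ranked = sorted(all_features, key=lambda j: abs(m[1][j] - m[0][j]), reverse=True)
    return sorted(ranked[:k])


def predict(x, centroids, features):
    """Assign the label whose centroid is nearest in the reduced feature space."""
    def sq_dist(label):
        return sum((x[j] - c) ** 2 for j, c in zip(features, centroids[label]))
    return min((0, 1), key=sq_dist)


def cross_validate(X, y, k_features=5, n_folds=5):
    """Select features on each training fold only, then score the held-out fold."""
    n = len(X)
    accuracies, chosen = [], []
    for fold in range(n_folds):
        test_idx = set(range(fold, n, n_folds))
        X_train = [X[i] for i in range(n) if i not in test_idx]
        y_train = [y[i] for i in range(n) if i not in test_idx]
        features = select_features(X_train, y_train, k_features)
        centroids = class_means(X_train, y_train, features)
        hits = sum(predict(X[i], centroids, features) == y[i] for i in test_idx)
        accuracies.append(hits / len(test_idx))
        chosen.append(features)
    return accuracies, chosen


X, y = make_data()
fold_accuracies, fold_features = cross_validate(X, y)
print("mean held-out accuracy:", mean(fold_accuracies))
print("features kept in first fold:", fold_features[0])
```

Ranking features inside each training fold, rather than once on the full data set, keeps the held-out samples from influencing which features are retained, so the averaged accuracy is not inflated by the selection step.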

Authors

  • Yichuan Liu
    Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA.
  • Hui-Qi Qu
    Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA.
  • Xiao Chang
    Department of Radiation Oncology, School of Medicine, Washington University in Saint Louis, St. Louis, MO 63110, USA.
  • Lifeng Tian
    Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA.
  • Jingchun Qu
    Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA.
  • Joseph Glessner
    Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA.
  • Patrick M A Sleiman
    Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA.
  • Hakon Hakonarson
    The Center for Applied Genomics, The Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA.