Machine Learning Approaches on High Throughput NGS Data to Unveil Mechanisms of Function in Biology and Disease.

Journal: Cancer genomics & proteomics
Published Date:

Abstract

In this review, the fundamentals of machine learning (ML) and data mining (DM) are summarized together with the techniques for distilling knowledge from state-of-the-art omics experiments. This includes an introduction to the basic mathematical principles of unsupervised and supervised learning methods, dimensionality reduction techniques, and deep neural network architectures, and to their applications in bioinformatics. Several case studies under evaluation mainly involve next generation sequencing (NGS) experiments, such as deciphering gene expression from total and single-cell (scRNA-seq) analysis; for the latter, recent artificial intelligence (AI) methods for the investigation of cell sub-types, biomarkers and imputation techniques are described. Other areas where various ML schemes have been investigated include the identification of transcription factor (TF) binding sites, chromatin organization patterns and RNA binding proteins (RBPs); analyses of RNA sequence and structure, as well as ML-based 3D protein structure prediction, are also described. Furthermore, we summarize recent methods that use ML in clinical oncology, combining current omics data with pharmacogenomics to determine personalized treatments. With this review we wish to provide the scientific community with a thorough overview of the main novel ML applications that take into consideration the latest achievements in genomics, thereby helping to unravel the fundamental mechanisms of biology towards the understanding and cure of diseases.

Authors

  • Vasileios C Pezoulas
    Unit of Medical Technology and Intelligent Information Systems, Dept. of Material Science and Engineering, University of Ioannina, GR45110, Ioannina, Greece.
  • Orsalia Hazapis
    Molecular Carcinogenesis Group, Department of Histology and Embryology, School of Medicine, National and Kapodistrian University of Athens, Athens, Greece.
  • Nefeli Lagopati
    Molecular Carcinogenesis Group, Department of Histology and Embryology, School of Medicine, National and Kapodistrian University of Athens, Athens, Greece.
  • Themis P Exarchos
    Institute of Communication and Computer Systems, National Technical University of Athens, Athens, Greece. Electronic address: Themis.exarchos@gmail.com.
  • Andreas V Goules
    Department of Pathophysiology and Joint Rheumatology, Medical School, National and Kapodistrian University of Athens, Greece; Research Institute for Systemic Autoimmune Diseases, Greece. Electronic address: agoules@med.uoa.gr.
  • Athanasios G Tzioufas
    Department of Pathophysiology and Joint Rheumatology, Medical School, National and Kapodistrian University of Athens, Greece; Biomedical Research Foundation of the Academy of Athens, Greece; Research Institute for Systemic Autoimmune Diseases, Greece.
  • Dimitrios I Fotiadis
    Biomedical Research Institute, Foundation for Research and Technology Hellas, Greece; Unit of Medical Technology and Intelligent Information Systems, Department of Materials Science and Engineering, University of Ioannina, Greece.
  • Ioannis G Stratis
    Department of Mathematics, National and Kapodistrian University of Athens, Athens, Greece.
  • Athanasios N Yannacopoulos
    Department of Statistics, Athens University of Economics & Business, Athens, Greece.
  • Vassilis G Gorgoulis
    Biomedical Research Foundation of the Academy of Athens, 4 Soranou Ephessiou Str., Athens GR-11527, Greece; Molecular Carcinogenesis Group, Department of Histology and Embryology, School of Medicine, National and Kapodistrian University of Athens, 75 Mikras Asias Str, Athens GR-11527, Greece; Division of Cancer Sciences, Faculty of Biology, Medicine and Health, Manchester Academic Health Science Centre, Manchester Cancer Research Centre, NIHR Manchester Biomedical Research Centre, University of Manchester, Manchester M20 4GJ, UK; Center for New Biotechnologies and Precision Medicine, Medical School, National and Kapodistrian University of Athens, 75 Mikras Asias Str, Athens GR-11527, Greece. Electronic address: vgorg@med.uoa.gr.