Prediction of future gene expression profile by analyzing its past variation pattern.
Journal:
Gene expression patterns : GEP
PMID:
33444808
Abstract
A number of initial hematopoietic stem cells (HSCs) are considered in a container; these cells can divide into HSCs or differentiate into various types of descendant cells. In this paper, a method is designed to predict an approximate gene expression profile (GEP) for the future descendant cells resulting from HSC division/differentiation. First, the GEP prediction problem is modeled as a multivariate time series prediction problem. A novel method called EHSCP (Extended Hematopoietic Stem Cell Prediction), an artificial neural machine, is then introduced to solve the problem. EHSCP accepts the initial sequence of measured GEPs as input and predicts the GEPs of future descendant cells. This prediction can be performed for multiple stages of cell division/differentiation. EHSCP treats the GEP sequence as a set of time series and computes the correlation between the input time series. Two novel artificial neural units, called PLSTM (Parametric Long Short-Term Memory) and MILSTM (Multi-Input LSTM), are designed. PLSTM enables EHSCP to take this correlation into account in its output prediction. Since GEP prediction involves thousands of time series, a hierarchical encoder is proposed that computes this correlation using 101 MILSTMs. EHSCP is trained on 155 datasets and evaluated on 39 test datasets. These evaluations show that EHSCP surpasses existing methods in terms of prediction accuracy and the number of correctly predicted division/differentiation stages. In these evaluations, the number of stages correctly predicted by EHSCP was 128 when as many as 8 initial stages were given.
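The framing described in the abstract, a GEP sequence treated as a multivariate time series with correlation computed between the per-gene series, can be illustrated with a minimal sketch. This is not EHSCP, PLSTM, or MILSTM: the function names `make_windows` and `gene_correlation`, the toy data, and the use of Pearson correlation are all assumptions made for illustration, since the abstract does not specify how the correlation is computed.

```python
import numpy as np


def make_windows(gep_seq, n_in, n_out):
    """Cut a (stages, genes) array into (input window, target window) pairs.

    Hypothetical helper: each input holds n_in consecutive stages and each
    target holds the n_out stages that follow it.
    """
    gep_seq = np.asarray(gep_seq, dtype=float)
    n_stages = gep_seq.shape[0]
    inputs, targets = [], []
    for start in range(n_stages - n_in - n_out + 1):
        inputs.append(gep_seq[start:start + n_in])
        targets.append(gep_seq[start + n_in:start + n_in + n_out])
    return np.stack(inputs), np.stack(targets)


def gene_correlation(gep_seq):
    """Pearson correlation between every pair of gene time series.

    Assumption: Pearson correlation stands in for whatever correlation
    measure the paper actually uses.
    """
    return np.corrcoef(np.asarray(gep_seq, dtype=float), rowvar=False)


# Toy data (made up): 12 division/differentiation stages, 5 genes.
rng = np.random.default_rng(0)
toy_gep = rng.normal(size=(12, 5)).cumsum(axis=0)

# 8 initial stages as input, the next 2 stages as the prediction target.
X, Y = make_windows(toy_gep, n_in=8, n_out=2)
corr = gene_correlation(toy_gep)
```

With 12 stages, 8 input stages, and 2 target stages, `X` has shape (3, 8, 5), `Y` has shape (3, 2, 5), and `corr` is a 5 x 5 matrix. A sequence model would then be trained to map each input window to its target window, and with thousands of genes the full correlation matrix becomes large, which is presumably the motivation for the hierarchical encoder mentioned in the abstract.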