AGEP_TWAS: A Deep Learning-based Framework for Predicting Gene Expression Levels in Tissues.

Journal: IEEE Transactions on Computational Biology and Bioinformatics

Abstract

Accurate prediction of gene expression levels across tissues is important for understanding the functional roles of genes in biological processes and for supporting transcriptome-wide association studies (TWAS). Traditional methods build a separate prediction model for each gene in a specific tissue, which is time-consuming and inefficient when many genes or tissues are involved. In addition, current approaches do not account for single nucleotide polymorphisms (SNPs) that are missing from the training population. Such SNPs significantly affect gene expression levels, which limits the predictive capability of these approaches on new samples. Recent research indicates that a specific group of core genes (known as landmark genes), which accurately reflect the cellular states of samples across different experimental conditions, can be used to predict the expression levels of the other genes in the genome. In light of this, we propose AGEP_TWAS (Adaptive Gene Expression Predictor for TWAS), a gene expression prediction method that combines a dense connection network, adaptive activation functions, and parameter pruning strategies within a nonlinear feature extraction framework. AGEP_TWAS uses the landmark genes within a tissue to predict the expression levels of other genes that are difficult to predict with traditional methods. Results on the human GEO expression dataset show that AGEP_TWAS achieves a mean squared error (MSE) of 0.1821 and a Pearson correlation coefficient (PCC) of 0.9004, outperforming existing state-of-the-art prediction models. When applied to the CattleGTEx dataset to infer gene expression levels across different tissues in cattle, AGEP_TWAS also showed better predictive performance than existing methods. A TWAS on milk production traits in cattle highlights the practical utility of AGEP_TWAS, with six identified significant genes already reported in the scientific literature as associated with milk production traits.
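
For readers less familiar with the two reported metrics, the following is a minimal sketch of how MSE and PCC can be computed between measured and predicted expression values. It is not the authors' evaluation code: the function names and the toy numbers are hypothetical, and the abstract does not say whether the metrics are averaged per gene or per sample, so the sketch scores a single gene across a few samples.

```python
import math


def mean_squared_error(y_true, y_pred):
    """Average squared difference between measured and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson_correlation(y_true, y_pred):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    # Sums of centered products; the 1/n factors cancel in the ratio below.
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    if var_t == 0 or var_p == 0:
        raise ValueError("PCC is undefined when either input is constant")
    return cov / math.sqrt(var_t * var_p)


# Hypothetical expression values for one target gene across four samples.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.5, 3.5, 4.5]

mse = mean_squared_error(measured, predicted)
pcc = pearson_correlation(measured, predicted)
print(f"MSE = {mse:.4f}, PCC = {pcc:.4f}")
```

In this toy case every prediction is offset by a constant 0.5, so the PCC is perfect while the MSE is nonzero. The two metrics capture different aspects of prediction quality, which is one reason to report both.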
