Predicting the genetic component of gene expression using gene regulatory networks
Journal: arXiv
Published Date: Aug 16, 2024
Abstract
Gene expression prediction plays a vital role in transcriptome-wide
association studies (TWAS), which seek to establish associations between tissue
gene expression and complex traits. Traditional models rely on genetic variants
in close genomic proximity to the gene of interest to predict the genetic
component of gene expression. In this study, we propose a novel approach
incorporating distal genetic variants acting through gene regulatory networks
(GRNs) into gene expression prediction models, in line with the omnigenic model
of complex trait inheritance. We use causal and coexpression GRNs reconstructed
from genomic and transcriptomic data, and model the data as a Bayesian
network jointly over genetic variants and genes. Gene expression is then
inferred from observed genotypic data in two steps. First,
the expression level of each gene in the network is predicted using its local
genetic variants. The residuals, calculated as the differences between the
observed and predicted expression levels, are then modeled using the genotype
information of parent and/or grandparent nodes in the GRN. The final predicted
expression level of the gene is obtained by summing the predictions from the
local variants model and the residual model, effectively incorporating both
local and distal genetic influences. Using various regularized regression
techniques for parameter estimation, we found that GRN-based gene expression
prediction outperformed the traditional local-variant approach on simulated
data from the DREAM5 Systems Genetics Challenge and on real data from both the
Geuvadis study and an eQTL mapping study in yeast. This study provides important
insights into the challenge of gene expression prediction for TWAS. It
reaffirms the importance of GRNs for understanding the genetic effects on gene
expression and complex traits more generally.
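
The two-step procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: the data are simulated, ridge regression stands in for the "various regularized regression techniques" mentioned in the abstract, and the names `ridge_fit`, `two_step_predict`, `G_local`, and `G_parent` are hypothetical. The error is computed on the training samples only; a real evaluation would use held-out data.

```python
import numpy as np

def ridge_fit(X, y, lam=1.0):
    """Closed-form ridge regression: solve (X'X + lam*I) b = X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def two_step_predict(G_local, G_parent, expr, lam=1.0):
    """Return (local-only prediction, local + residual-model prediction)."""
    # Step 1: predict expression from the gene's own local variants.
    b_local = ridge_fit(G_local, expr, lam)
    local_pred = G_local @ b_local
    # Step 2: model the residuals using genotypes of the parent gene's variants.
    residual = expr - local_pred
    b_distal = ridge_fit(G_parent, residual, lam)
    distal_pred = G_parent @ b_distal
    # Final prediction: sum of the local model and the residual model.
    return local_pred, local_pred + distal_pred

# Simulated toy data (assumption): one target gene with one parent in the GRN.
rng = np.random.default_rng(0)
n = 200
G_local = rng.binomial(2, 0.3, size=(n, 5)).astype(float)   # target's local variants
G_parent = rng.binomial(2, 0.3, size=(n, 5)).astype(float)  # parent's local variants
G_local -= G_local.mean(axis=0)    # center genotypes so no intercept is needed
G_parent -= G_parent.mean(axis=0)

parent_expr = G_parent @ rng.normal(size=5) + rng.normal(scale=0.5, size=n)
expr = G_local @ rng.normal(size=5) + 0.8 * parent_expr + rng.normal(scale=0.5, size=n)
expr -= expr.mean()

local_pred, full_pred = two_step_predict(G_local, G_parent, expr)
mse_local = np.mean((expr - local_pred) ** 2)
mse_full = np.mean((expr - full_pred) ** 2)
print(f"training MSE, local variants only: {mse_local:.3f}")
print(f"training MSE, local + GRN parent:  {mse_full:.3f}")
```

In this sketch the parent gene contributes only through its own local variants; extending the residual model to grandparent nodes would amount to appending their genotype columns to `G_parent`.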