Tumor cell specific total mRNA expression informed neural networks predicts cancer progression
Journal: bioRxiv
Published Date: May 6, 2026
Abstract
Inferring tumor molecular phenotypes from high-dimensional multi-omic data is a fundamental challenge in computational biology. Current methods for estimating tumor cell-specific total mRNA expression (TmS) require matched DNA and RNA sequencing data and rely on computationally intensive deconvolution pipelines. We present TmSNet, a deep learning framework that predicts TmS from mRNA, DNA methylation, miRNA, and immune cell proportions as input features. TmSNet combines structured feature selection (gradient boosting, LASSO, elastic net) with specialized neural architectures to predict TmS as a continuous value. Across 12 TCGA cancer types, TmSNet achieved cross-validated performance of up to 0.93 in concordance correlation coefficient (CCC) and 0.88 in R-squared, and it generalized to external cohorts with correlations of 0.54 (SCAN-B) and 0.43 (FUSCC). Predicted TmS values stratify patients by risk and preserve known transcriptional profiles across tumor subtypes. These results demonstrate that TmSNet can infer biologically meaningful phenotypes from multi-omic data and that it provides a scalable framework for modeling tumor transcriptional activity in heterogeneous cohorts.
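
The abstract reports agreement between predicted and reference TmS as a concordance correlation coefficient. As a minimal sketch of how such a score can be computed, the function below follows Lin's standard definition with population (1/n) moments. The abstract does not state which estimator TmSNet uses, so the function name `concordance_correlation` and the choice of 1/n moments are assumptions made for illustration only.

```python
from statistics import fmean


def concordance_correlation(y_true, y_pred):
    """Lin's concordance correlation coefficient (CCC).

    Assumption: population (1/n) variances and covariance, as in Lin's
    original definition. This is an illustrative helper, not TmSNet code.
    """
    if len(y_true) != len(y_pred) or len(y_true) < 2:
        raise ValueError("inputs must have equal length of at least 2")

    n = len(y_true)
    mean_t = fmean(y_true)
    mean_p = fmean(y_pred)

    # Population second moments around each mean.
    var_t = sum((t - mean_t) ** 2 for t in y_true) / n
    var_p = sum((p - mean_p) ** 2 for p in y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred)) / n

    # The squared mean difference penalizes systematic offset,
    # which a plain Pearson correlation would ignore.
    denom = var_t + var_p + (mean_t - mean_p) ** 2
    if denom == 0:
        raise ValueError("CCC is undefined when both inputs are constant and equal")
    return 2 * cov / denom


if __name__ == "__main__":
    reference = [1.0, 2.0, 3.0]
    shifted = [2.0, 3.0, 4.0]  # perfectly correlated, but offset by +1
    print(concordance_correlation(reference, reference))
    print(concordance_correlation(reference, shifted))
```

Unlike Pearson correlation, CCC drops below 1 when predictions are perfectly correlated with the reference but shifted or rescaled, which is why it is a stricter measure of agreement for a continuous target such as TmS.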