A Context-Specific, Literature-Supported Framework for Validating Stress Response Models in Mammals
Journal:
bioRxiv
Published Date:
Jan 1, 2025
Abstract
Computational models of stress responses can highlight candidate genes underlying physiological adaptation, but their utility depends on rigorous validation. Using existing biological databases, we tested a novel approach to identifying and grouping differentially expressed genes (DEGs) from RNA-seq data. The approach was used to build a neural network that models the epigenetic response of human cells to temperature stress. Four gene groups (Key-Response, Treatment-Specific, Support, and Noisy) were identified as representing the Principal Response of the species to the stress. The Support Group was also suspected to represent housekeeping genes, based on its variability patterns. To validate these assumptions, we built protein–protein interaction (PPI) networks using the Human Protein Atlas and STRING-db, incorporating both direct and second-order connections. Crucially, second-order connections were restricted to those made via DEGs, so that connectivity reflected condition-specific stress responses rather than generic hubs. Across two conditions, >75% of Principal Response genes assembled into largest connected components (LCCs) that were significantly larger than those of random networks (p < 0.00005). Support Group genes also showed strong connectivity and enriched overlap with a housekeeping gene database, supporting their distinct classification. STRING-db confirmed PPI enrichment but produced less reliable results than our DEG-restricted framework. Overall, this study demonstrates that the model identifies biologically meaningful, interconnected stress-response networks. By emphasizing DEG-restricted second-order connections, our framework addresses key limitations of context-free enrichment methods and advances validation strategies for computational models of gene regulation, with implications for stress physiology and personalized medicine.
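To make the DEG-restricted connectivity test concrete, the following is a minimal sketch in Python, not the authors' implementation. It assumes that two genes in a group are linked if they interact directly or share an interaction partner that is itself a DEG, and that the null distribution comes from random gene sets of equal size drawn from the network (the abstract does not specify how the random networks were generated). The function names, the toy edge list, and the gene labels are all hypothetical.

```python
import random
from collections import defaultdict


def build_adjacency(edges):
    """Undirected adjacency map from an iterable of (gene_a, gene_b) pairs."""
    adj = defaultdict(set)
    for a, b in edges:
        if a != b:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def lcc_size(genes, adj, degs):
    """Size of the largest connected component (LCC) among `genes`.

    Assumption: two genes are linked if they interact directly, or if they
    share an intermediate partner that is a DEG (second-order connection).
    """
    genes = set(genes)
    links = {g: set() for g in genes}
    for g in genes:
        for n in adj.get(g, ()):
            if n in genes:
                links[g].add(n)  # direct connection
            if n in degs:
                # second-order connection, allowed only through a DEG
                links[g].update(m for m in adj[n] if m in genes and m != g)

    seen, best = set(), 0
    for start in genes:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:  # depth-first traversal of one component
            cur = stack.pop()
            size += 1
            for nxt in links[cur]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        best = max(best, size)
    return best


def lcc_p_value(genes, background, adj, degs, n_perm=1000, seed=0):
    """Empirical p-value: how often a random gene set of the same size
    reaches an LCC at least as large as the observed one."""
    rng = random.Random(seed)
    k = len(set(genes))
    observed = lcc_size(genes, adj, degs)
    pool = sorted(background)
    hits = sum(
        lcc_size(rng.sample(pool, k), adj, degs) >= observed
        for _ in range(n_perm)
    )
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical toy network: X is a DEG intermediate, Y is not.
toy_edges = [("A", "B"), ("B", "X"), ("X", "C"), ("C", "Y"), ("Y", "D"), ("D", "Z")]
toy_adj = build_adjacency(toy_edges)
toy_group = {"A", "B", "C", "D"}
toy_degs = {"A", "B", "C", "D", "X"}

observed, p_value = lcc_p_value(toy_group, toy_adj.keys(), toy_adj, toy_degs, n_perm=200)
```

In this toy network, B and C are joined through the DEG X, whereas C and D are not joined because their shared partner Y is not differentially expressed. This illustrates how restricting intermediates to DEGs keeps generic hubs from inflating connectivity.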