A deep ensemble framework for human essential gene prediction by integrating multi-omics data.
Journal:
Scientific Reports
Published Date:
Jul 21, 2025
Abstract
Essential genes are necessary for the survival or reproduction of a living organism. Predicting and analyzing gene essentiality can advance our understanding of basic biology and human disease, and can support the development of new drugs. We propose DeEPsnap, a snapshot ensemble deep neural network method for predicting human essential genes. DeEPsnap integrates features derived from DNA and protein sequence data with features extracted or learned from four types of functional data: gene ontology, protein complex, protein domain, and protein-protein interaction networks. More than 200 features are extracted or learned from these data and integrated to train a series of cost-sensitive deep neural networks. The proposed snapshot mechanism allows multiple models to be trained without additional training effort or cost. In 10-fold cross-validation, DeEPsnap predicts human gene essentiality with an average AUROC of 96.16%, AUPRC of 93.83%, and accuracy of 92.36%. Comparative experiments show that DeEPsnap outperforms several popular traditional machine learning and deep learning models, although all of those models also perform well when trained on the features created for DeEPsnap. These results demonstrate that DeEPsnap is effective for predicting human essential genes.
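The snapshot mechanism and cost-sensitive training mentioned in the abstract can be illustrated with a minimal sketch. This is not the DeEPsnap implementation. It assumes a cyclic cosine-annealed learning rate, as is common in the general snapshot ensemble technique (the abstract does not state the schedule), and it uses a tiny logistic model in place of a deep network. The names `cosine_lr`, `train_snapshots`, `ensemble_predict`, and `pos_weight`, as well as the toy data, are hypothetical.

```python
import math


def cosine_lr(step, steps_per_cycle, lr_max):
    """Cyclic cosine-annealed learning rate (assumed schedule); restarts at lr_max each cycle."""
    t = step % steps_per_cycle
    return 0.5 * lr_max * (1.0 + math.cos(math.pi * t / steps_per_cycle))


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(weights, x):
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)))


def train_snapshots(X, y, n_cycles=3, steps_per_cycle=50, lr_max=0.5, pos_weight=2.0):
    """One training run; the weights are saved at the end of each learning-rate cycle."""
    weights = [0.0] * len(X[0])
    snapshots = []
    for step in range(n_cycles * steps_per_cycle):
        lr = cosine_lr(step, steps_per_cycle, lr_max)
        grad = [0.0] * len(weights)
        for xi, yi in zip(X, y):
            p = predict(weights, xi)
            # Cost-sensitive loss: errors on the positive (essential) class weigh more.
            cost = pos_weight if yi == 1 else 1.0
            for j, xij in enumerate(xi):
                grad[j] += cost * (p - yi) * xij
        weights = [w - lr * g / len(X) for w, g in zip(weights, grad)]
        if (step + 1) % steps_per_cycle == 0:
            snapshots.append(list(weights))  # snapshot taken when lr is near its minimum
    return snapshots


def ensemble_predict(snapshots, x):
    """Average the predicted probabilities of all snapshot models."""
    return sum(predict(w, x) for w in snapshots) / len(snapshots)


# Toy data (made up): a bias term plus one feature; label 1 marks an "essential" gene.
X = [[1.0, 0.1], [1.0, 0.3], [1.0, 0.2], [1.0, 0.9], [1.0, 0.8], [1.0, 0.4]]
y = [0, 0, 0, 1, 1, 0]
snapshots = train_snapshots(X, y)
score_high = ensemble_predict(snapshots, [1.0, 0.9])
score_low = ensemble_predict(snapshots, [1.0, 0.1])
```

In this sketch the three snapshot models come from a single training run of 150 steps, which is the sense in which an ensemble is obtained without extra training cost. The class weight stands in for whatever cost-sensitive scheme the authors use.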