FactVAE: a factorized variational autoencoder for single-cell multi-omics data integration analysis.
Journal:
Briefings in bioinformatics
PMID:
40211981
Abstract
Single-cell multi-omics technologies have revolutionized the study of cell states and functions by simultaneously profiling multiple molecular layers within individual cells. However, existing methods for integrating these data struggle to preserve critical feature information and do not exploit known regulatory knowledge, which is essential for understanding cell functions. These limitations hinder their ability to provide comprehensive and accurate insights into cells. Here, we propose FactVAE, a factorized variational autoencoder designed for robust and accurate analysis of single-cell multi-omics data. FactVAE integrates the factorization principle into the variational autoencoder framework, preserving feature information while using neural networks to capture non-linear sample information. Additionally, known regulatory knowledge is incorporated during model training, and a knowledge transfer strategy is employed for cell embedding optimization and data augmentation. Comparative analyses on single-cell multi-omics datasets from different protocols and on a spatial multi-omics dataset demonstrate that FactVAE not only outperforms benchmark methods in clustering performance but also generates augmented data that most clearly reveal cell-type-specific motif expression. Moreover, the feature embeddings captured by FactVAE enable reliable inference of potential gene regulatory relationships. Overall, FactVAE's superior performance and strong scalability make it a promising new solution for single-cell multi-omics data analysis.
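
The abstract does not give the model's architecture, so the following is only a minimal illustrative sketch of the general idea of combining factorization with a variational autoencoder, not FactVAE's actual implementation. It assumes a non-linear encoder that produces cell embeddings and a linear decoder that reconstructs each value as the inner product of a cell embedding and a learned feature embedding. All names, dimensions, the toy data, and the Gaussian reconstruction loss are hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes chosen only for illustration.
n_cells, n_features, n_hidden, n_latent = 8, 20, 16, 4

# Toy data standing in for one omics layer (cells x features).
X = rng.normal(size=(n_cells, n_features))

# Assumed encoder weights: one hidden layer, then mean and log-variance heads.
W1 = rng.normal(scale=0.1, size=(n_features, n_hidden))
W_mu = rng.normal(scale=0.1, size=(n_hidden, n_latent))
W_logvar = rng.normal(scale=0.1, size=(n_hidden, n_latent))

# Feature embedding matrix: one latent vector per feature.
V = rng.normal(scale=0.1, size=(n_features, n_latent))


def encode(x):
    """Non-linear encoder: maps cells to the parameters of q(z | x)."""
    h = np.tanh(x @ W1)
    return h @ W_mu, h @ W_logvar


def reparameterize(mu, logvar, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps


def decode(z, feature_embeddings):
    """Factorization-style linear decoder: x_hat[i, j] = <z_i, v_j>."""
    return z @ feature_embeddings.T


def negative_elbo(x, x_hat, mu, logvar):
    """Squared-error reconstruction term plus KL(q(z | x) || N(0, I))."""
    recon = 0.5 * np.sum((x - x_hat) ** 2, axis=1)
    kl = -0.5 * np.sum(1.0 + logvar - mu ** 2 - np.exp(logvar), axis=1)
    return np.mean(recon + kl)


mu, logvar = encode(X)
Z = reparameterize(mu, logvar, rng)   # cell embeddings (cells x latent)
X_hat = decode(Z, V)                  # reconstruction (cells x features)
loss = negative_elbo(X, X_hat, mu, logvar)
```

In a sketch like this, the rows of `Z` play the role of cell embeddings and the rows of `V` play the role of feature embeddings. Because the decoder is linear, each feature keeps an explicit vector in the same latent space as the cells, which is one way feature information could be preserved for downstream comparison.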