Semi-supervised contrastive learning variational autoencoder integrating single-cell multimodal mosaic datasets.

Journal: BMC bioinformatics

Published Date: Aug 4, 2025

Abstract

As single-cell sequencing technology has become widely used, researchers have found that single-modality data alone cannot fully meet the needs of studying complex biological systems. To address this, they have begun to collect multimodal single-cell omics data simultaneously. However, different sequencing technologies often produce datasets in which one or more modalities are missing, so mosaic datasets are common in practice. The high dimensionality and sparsity of the data make analysis more difficult, and batch effects pose an additional challenge. To address these challenges, we propose scGCM, a flexible integration framework based on a variational autoencoder. The main task of scGCM is to integrate single-cell multimodal mosaic data and eliminate batch effects. The method was evaluated on multiple datasets encompassing different modalities of single-cell data. The results demonstrate that, compared to state-of-the-art multimodal data integration methods, scGCM offers significant advantages in clustering accuracy and data consistency. The source code of scGCM can be accessed at https://github.com/closmouz/scCGM .

Authors

Zihao Wang

Department of Neurosurgery, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences, Peking Union Medical College, Beijing, China.

Zeyu Wu

School of Food and Biological Engineering, Hefei University of Technology, Hefei 230601, China; Engineering Research Center of Bio-Process, Ministry of Education, Hefei University of Technology, Hefei 230601, China. Electronic address: wuzeyu@hfut.edu.cn.

Minghua Deng

Center for Quantitative Biology, Peking University, Beijing, China. dengmh@pku.edu.cn.

Keywords

Algorithms; Autoencoder; Cluster Analysis; Computational Biology; Humans; Single-Cell Analysis; Software; Supervised Machine Learning

External Resources

PubMed ID: 40759922
