Ensemble machine learning-based pre-trained annotation approach for scRNA-seq data using gradient boosting with genetic optimizer.

Journal: BMC Bioinformatics

Published Date: Jul 1, 2025

Abstract

Single-cell RNA sequencing (scRNA-seq) has revolutionized the study of gene expression by allowing researchers to analyze the transcriptomes of individual cells. This technology provides unprecedented insights into cellular heterogeneity, cellular states, and biological processes at a single-cell resolution. The problem of single-cell RNA annotation involves assigning meaningful labels or annotations to each cell in the scRNA-seq dataset, indicating its corresponding cell type, state, or biological function. Current annotation methods are often challenged by limited source data quality, sensitivity to batch effects, and poor adaptability to uncharacterized cell types. We propose an ensemble machine learning-based pre-trained annotation framework that integrates gradient boosting and genetic optimization for robust feature selection. The proposed method uses ensemble learning to enhance annotation accuracy under data scarcity, addressing limitations in existing supervised methods by leveraging a combination of multiple annotated datasets and feature alignment strategies. Through comprehensive benchmarking across varied biological contexts, we demonstrate that the proposed approach significantly improves annotation accuracy and generalization across different scRNA-seq platforms, especially under conditions of reduced reference data. Results confirm its versatility and resilience in accurately annotating cell types, even under reduced data conditions, establishing it as a powerful tool for cell-type classification in scRNA-seq data.
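The abstract names the two core ingredients, gradient boosting and a genetic optimizer for feature selection, without giving details. The sketch below shows one way these could fit together and is not the authors' implementation: a small genetic algorithm evolves boolean gene masks and scores each mask by the cross-validated accuracy of a gradient-boosting classifier. Everything in it is an assumption made for illustration, including the synthetic data standing in for an expression matrix, the use of scikit-learn's GradientBoostingClassifier, the hypothetical function names fitness and ga_select, and all hyperparameters.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score


def fitness(mask, X, y):
    """Mean 3-fold CV accuracy of a gradient-boosting model on the selected features."""
    if not mask.any():
        return 0.0  # an empty feature set cannot classify anything
    model = GradientBoostingClassifier(n_estimators=20, max_depth=2, random_state=0)
    return cross_val_score(model, X[:, mask], y, cv=3).mean()


def ga_select(X, y, pop_size=8, generations=4, mutation_rate=0.05, seed=0):
    """Toy genetic algorithm over boolean feature masks (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    # Each individual is a boolean mask: True means the feature (gene) is kept.
    population = rng.random((pop_size, n_features)) < 0.5
    best_mask, best_score = None, -1.0
    for _ in range(generations):
        scores = np.array([fitness(ind, X, y) for ind in population])
        order = np.argsort(scores)[::-1]
        if scores[order[0]] > best_score:
            best_score = float(scores[order[0]])
            best_mask = population[order[0]].copy()
        # Selection: the better half of the population survives as parents.
        parents = population[order[: pop_size // 2]]
        children = []
        while len(children) < pop_size - len(parents):
            a, b = parents[rng.choice(len(parents), size=2, replace=False)]
            cut = rng.integers(1, n_features)  # single-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            flip = rng.random(n_features) < mutation_rate  # bit-flip mutation
            children.append(np.logical_xor(child, flip))
        population = np.vstack([parents, np.array(children)])
    return best_mask, best_score


# Synthetic stand-in for a cells-by-genes matrix with three "cell types".
X, y = make_classification(
    n_samples=150, n_features=20, n_informative=5, n_classes=3, random_state=0
)
mask, score = ga_select(X, y)
print("selected features:", np.flatnonzero(mask))
print("cross-validated accuracy:", round(score, 3))
```

On real scRNA-seq data the input would be a normalized expression matrix with reference labels, and the population size and number of generations would need to be far larger than in this toy run.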

Authors

Osama Elnahas
School of Mathematical Sciences, Shenzhen University, Shenzhen, 518000, China.

Waleed M Ead
Faculty of Computing and Information, Al-Baha University, Al-Baha, Saudi Arabia.

Yushan Qiu
College of Mathematics and Statistics, Shenzhen University, Nanhai Avenue 3688, Shenzhen, 518060, China. yushan.qiu@szu.edu.cn.

Jian Lu
Key Laboratory of Microelectronic Devices & Integrated Technology, Institute of Microelectronics, Chinese Academy of Sciences, Beijing 100029, China.

Keywords

Algorithms; Humans; Machine Learning; Molecular Sequence Annotation; RNA-Seq; Sequence Analysis, RNA; Single-Cell Analysis; Single-Cell Gene Expression Analysis

External Resources

PubMed ID: 40596854
