Ensemble machine learning-based pre-trained annotation approach for scRNA-seq data using gradient boosting with genetic optimizer.

Journal: BMC bioinformatics
Published Date:

Abstract

Single-cell RNA sequencing (scRNA-seq) has revolutionized the study of gene expression by allowing researchers to analyze the transcriptomes of individual cells. This technology provides unprecedented insights into cellular heterogeneity, cellular states, and biological processes at a single-cell resolution. The problem of single-cell RNA annotation involves assigning meaningful labels or annotations to each cell in the scRNA-seq dataset, indicating its corresponding cell type, state, or biological function. Current annotation methods are often challenged by limited source data quality, sensitivity to batch effects, and poor adaptability to uncharacterized cell types. We propose an ensemble machine learning-based pre-trained annotation framework that integrates gradient boosting and genetic optimization for robust feature selection. The proposed method uses ensemble learning to enhance annotation accuracy under data scarcity, addressing limitations in existing supervised methods by leveraging a combination of multiple annotated datasets and feature alignment strategies. Through comprehensive benchmarking across varied biological contexts, we demonstrate that the proposed approach significantly improves annotation accuracy and generalization across different scRNA-seq platforms, especially under conditions of reduced reference data. Results confirm its versatility and resilience in accurately annotating cell types, even under reduced data conditions, establishing it as a powerful tool for cell-type classification in scRNA-seq data.

Authors

  • Osama Elnahas
    School of Mathematical Sciences, Shenzhen University, Shenzhen, 518000, China.
  • Waleed M Ead
    Faculty of Computing and Information, Al-Baha University, Al-Baha, Saudi Arabia.
  • Yushan Qiu
    College of Mathematics and Statistics, Shenzhen University, Nanhai Avenue 3688, Shenzhen, 518060, China. yushan.qiu@szu.edu.cn.
  • Jian Lu
    Key Laboratory of Microelectronic Devices & Integrated Technology, Institute of Microelectronics, Chinese Academy of Sciences, Beijing 100029, China.