Identification of biomarkers for acute leukemia via machine learning-based stemness index.

Journal: Gene
PMID:

Abstract

Traditional methods for studying the biological characteristics of leukemia stem cells (LSCs) include constructing LSC-like cells and mouse models by transgenic or knock-in methods. However, these methods have potential pitfalls, such as retroviral insertional mutagenesis, non-physiological levels of gene expression, non-physiological expansion, and difficulty of construction. The mRNAsi, a stemness index computed by machine learning for each sample of The Cancer Genome Atlas (TCGA), could avoid these pitfalls. In this work, we aimed to construct a network of LSC genes using the mRNAsi. First, mRNAsi values were analyzed with respect to expression distribution, survival, age, and gender in acute myeloid leukemia (AML) samples. Then, we used weighted gene co-expression network analysis (WGCNA) to construct modules of stemness genes. The correlation among the transcription levels of the LSC genes and the interplay among the LSC proteins were analyzed. We performed functional and pathway enrichment analysis to annotate the stemness genes. Survival analysis using clinical data from TCGA and the Gene Expression Omnibus (GEO) database further identified prognostic biomarkers. We found that mRNAsi was not significantly associated with overall survival, which may be due to the heterogeneity of AML across stages of myeloid differentiation, as reflected in the French-American-British (FAB) classification system. Enrichment analysis indicated that the stemness genes were biologically clustered as a group and mainly associated with cell cycle and mitosis. Moreover, 10 key genes (SNRNP40, RFC4, RFC5, CDC6, HSPE1, PA2G4, SNAP23P, DARS2, MIS18A, and HPRT1) were screened by survival analysis with data from TCGA and GEO. Among them, RFC4 and RFC5 stood out as biomarkers because their prognostic value was validated in both databases. Additionally, the expression of RFC4 and RFC5 followed the same trend as the mRNAsi score across FAB subtypes.
In conclusion, our results demonstrate that the mRNAsi-based LSC-related genes interact strongly as a cluster. These genes, especially RFC4 and RFC5, could be therapeutic targets for inhibiting the stemness characteristics of AML. This work also provides a comprehensive pipeline for future cancer stem cell studies.
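
The abstract does not state how the survival comparisons were computed. As a rough illustration only, the sketch below assumes a common approach: samples are split at the median expression of a gene, and the two groups are compared with a two-group log-rank test. The function names (logrank_test, median_split_logrank) and the toy values are hypothetical and are not taken from the study.

```python
import math
from statistics import median


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test. Returns (chi2, p_value) with 1 degree of freedom.

    times_*  : follow-up times
    events_* : 1 if the event (death) was observed, 0 if censored
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    # Only time points with at least one observed event contribute.
    event_times = sorted({t for t, e, _ in data if e})

    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)  # at risk, group A
        n_b = sum(1 for ti, _, g in data if ti >= t and g == 1)  # at risk, group B
        d_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        d_b = sum(1 for ti, e, g in data if ti == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            # Hypergeometric variance of the event count in group A.
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def median_split_logrank(expression, times, events):
    """Assumed grouping rule: 'high' is strictly above the median expression."""
    cutoff = median(expression)
    high = [(t, e) for x, t, e in zip(expression, times, events) if x > cutoff]
    low = [(t, e) for x, t, e in zip(expression, times, events) if x <= cutoff]
    return logrank_test(
        [t for t, _ in high], [e for _, e in high],
        [t for t, _ in low], [e for _, e in low],
    )


if __name__ == "__main__":
    # Toy values for illustration only; not data from TCGA or GEO.
    expression = [2.1, 3.4, 8.9, 9.5, 1.7, 7.8, 2.9, 9.1]
    times = [30, 28, 5, 7, 35, 9, 26, 4]
    events = [0, 1, 1, 1, 0, 1, 1, 1]
    chi2, p = median_split_logrank(expression, times, events)
    print(f"chi2={chi2:.3f}, p={p:.4f}")
```

A gene would be considered prognostic under this sketch if the p-value falls below a chosen threshold in both cohorts; the threshold and the grouping rule are choices made here for illustration, not values reported in the abstract.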

Authors

  • Yitong Zhang
    Department of Biochemistry and Molecular Biology, Harbin Medical University, Harbin 150081, China.
  • Dongzhe Liu
    Department of Biochemistry and Molecular Biology, Harbin Medical University, Harbin 150081, China; Department of Hematology and Oncology, International Cancer Center, Shenzhen Key Laboratory, Shenzhen University General Hospital, Shenzhen University Clinical Medical Academy, Shenzhen University Health Science Center, Xueyuan AVE 1098, Shenzhen 518000, China.
  • Fenglan Li
    Department of Biochemistry and Molecular Biology, Harbin Medical University, Harbin 150081, China.
  • Zihui Zhao
    Department of Biochemistry and Molecular Biology, Harbin Medical University, Harbin 150081, China.
  • Xiqing Liu
    The State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China.
  • Dixiang Gao
    School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, Beijing 100876, China.
  • Yutong Zhang
    Department of Biochemistry and Molecular Biology, Harbin Medical University, Harbin 150081, China.
  • Hui Li
    Department of Ophthalmology, Beijing Hospital, National Center of Gerontology, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, China.