Machine learning-based identification of diagnostic and prognostic mitotic cell cycle genes in hepatocellular carcinoma.

Journal: PLOS ONE
Published Date:

Abstract

The mitotic cell cycle (MCC) is a critical process in cell growth and division, and dysregulation of MCC genes may contribute to tumorigenesis. In this study, to assess the diagnostic and prognostic value of MCC genes in hepatocellular carcinoma (HCC), MCC genes differentially expressed between HCC and normal tissues were identified and subjected to machine learning methods. Support vector machine recursive feature elimination (SVM-RFE) and random forest recursive feature elimination (RF-RFE) were employed to select the most informative diagnostic genes. The SVM-RFE model demonstrated high performance in TCGA (AUC = 1.0) and generalized to GSE77509 (AUC = 0.95) and GSE144269 (AUC = 0.879), outperforming RF-RFE. Permutation testing confirmed that these AUCs were outside the null distribution for all datasets. Nine genes, CDKN3, TRIP13, RACGAP1, FBXO43, EZH2, SPDL1, E2F1, TUBE1 and CDC6, were selected by both SVM-RFE and RF-RFE and showed robust individual diagnostic performance across datasets (AUCs > 0.81). Univariate Cox regression followed by LASSO Cox regression was used to identify a prognostic gene signature consisting of eight MCC genes, BCAT1, DPF1, CDKN2B, CDKN2C, TUBA3C, IGF1, CDC14B and SMARCA2, that predicted overall survival of HCC patients. The risk score derived from this signature was shown to be an independent prognostic factor for HCC, and its combination with AJCC stage improved prognostic value. Kaplan-Meier analysis showed that a high risk score was associated with poorer survival across clinical subgroups defined by stage, grade, age, and gender. Additionally, the risk score was significantly higher in patients with advanced-stage and high-grade tumors. In conclusion, machine learning approaches identified diagnostic biomarker candidates that classify HCC patients and healthy controls, and a novel prognostic gene signature that predicts overall survival of HCC patients.
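The permutation test on AUC mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the scores and labels are invented toy values, the function names `auc` and `permutation_p_value` are hypothetical, and the labels are shuffled against fixed classifier scores rather than retraining a model on each permutation, which is a simplifying assumption.

```python
import random


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive sample has the higher score; ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def permutation_p_value(scores, labels, n_permutations=1000, seed=0):
    """Return the observed AUC and an empirical p-value: the share of
    label shufflings whose AUC is at least as large as the observed one."""
    rng = random.Random(seed)
    observed = auc(scores, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break the link between scores and labels
        if auc(scores, shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_permutations + 1)


# Toy data (assumed, not from the study): 1 = tumor, 0 = normal.
scores = [0.9, 0.8, 0.75, 0.7, 0.6, 0.4, 0.35, 0.3, 0.2, 0.1]
labels = [1, 1, 1, 0, 1, 0, 1, 0, 0, 0]

observed_auc, p_value = permutation_p_value(scores, labels)
print(f"observed AUC = {observed_auc:.2f}, permutation p = {p_value:.3f}")
```

An observed AUC that sits in the far tail of the shuffled-label distribution, giving a small empirical p-value, is what "outside the null distribution" means in this setting.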

Authors

  • Ceren Sucularli
    Department of Bioinformatics, Hacettepe University, Turkiye.