Distinguishing Rectal Cancer from Colon Cancer Based on the Support Vector Machine Method and RNA-sequencing Data.

Journal: Current medical science
Published Date:

Abstract

Colorectal cancer (CRC) is the third most commonly diagnosed cancer worldwide. Several studies have indicated that rectal cancer differs significantly from colon cancer in terms of treatment, prognosis, and metastasis. Recently, the differential mRNA expression of colon cancer and rectal cancer has received a great deal of attention. The current study aimed to identify significant differences between colon cancer and rectal cancer based on RNA sequencing (RNA-seq) data via support vector machines (SVM). Here, 393 CRC samples from The Cancer Genome Atlas (TCGA) database were investigated, including 298 patients with colon cancer and 95 with rectal cancer. Following random forest (RF) analysis of the mRNA expression data, 96 genes, including HOXB13, PRAC, and BCLAF1, were identified and used to build the SVM classification model with leave-one-out cross-validation (LOOCV). In the training (n=196) and validation (n=197) cohorts, the accuracy (82.1% and 82.2%, respectively) and the AUC (0.87 and 0.91, respectively) indicated that the optimal SVM classification model distinguished colon cancer from rectal cancer reasonably well. However, additional experiments are required to validate the predicted gene expression levels and functions.
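
To make the evaluation described in the abstract more concrete, the following is a minimal sketch of leave-one-out cross-validation around an SVM, scored by accuracy and AUC. It is not the authors' pipeline. The data are synthetic stand-ins for expression values, the RF gene-selection step is omitted, and the classifier is a linear SVM trained with Pegasos-style stochastic subgradient steps, which is an assumption since the abstract does not state the kernel or solver. All function names and parameter values (`train_linear_svm`, `lam`, `steps`, the 30/10 class sizes) are hypothetical.

```python
import random

def train_linear_svm(xs, ys, lam=0.01, steps=2000, seed=0):
    """Pegasos-style stochastic subgradient descent on the hinge loss.

    xs: feature vectors; ys: labels in {-1, +1}. Returns the weight vector,
    whose last entry acts as a bias learned from a constant feature.
    """
    rng = random.Random(seed)
    data = [list(x) + [1.0] for x in xs]          # append constant bias feature
    w = [0.0] * len(data[0])
    for t in range(1, steps + 1):
        i = rng.randrange(len(data))              # pick one training sample
        eta = 1.0 / (lam * t)                     # decaying step size
        margin = ys[i] * sum(wj * xj for wj, xj in zip(w, data[i]))
        w = [(1.0 - eta * lam) * wj for wj in w]  # shrink: L2 penalty
        if margin < 1.0:                          # hinge loss is active
            w = [wj + eta * ys[i] * xj for wj, xj in zip(w, data[i])]
    return w

def decision_value(w, x):
    """Signed score; positive means the +1 class."""
    return sum(wj * xj for wj, xj in zip(w, list(x) + [1.0]))

def loocv_scores(xs, ys):
    """Score each sample with a model trained on all the other samples."""
    scores = []
    for i in range(len(xs)):
        w = train_linear_svm(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        scores.append(decision_value(w, xs[i]))
    return scores

def accuracy(ys, scores):
    """Fraction of samples whose score sign matches the label."""
    hits = sum((1 if s > 0 else -1) == y for s, y in zip(scores, ys))
    return hits / len(ys)

def auc(ys, scores):
    """Probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, ys) if y == 1]
    neg = [s for s, y in zip(scores, ys) if y == -1]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Synthetic data only: 30 "colon" (-1) and 10 "rectal" (+1) samples with
# 5 made-up features, loosely mirroring the class imbalance in the abstract.
rng = random.Random(42)
n_genes = 5
colon = [[rng.gauss(-1.0, 1.0) for _ in range(n_genes)] for _ in range(30)]
rectal = [[rng.gauss(1.0, 1.0) for _ in range(n_genes)] for _ in range(10)]
xs = colon + rectal
ys = [-1] * 30 + [1] * 10

scores = loocv_scores(xs, ys)
print("LOOCV accuracy:", accuracy(ys, scores))
print("LOOCV AUC:", auc(ys, scores))
```

Because the classes are imbalanced (298 colon versus 95 rectal samples in the study), accuracy alone can look good for a model that mostly predicts the majority class, which is one reason to report the AUC alongside it.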

Authors

  • Yan Zhang
    Affiliated Hospital of Liaoning University of Traditional Chinese Medicine, Shenyang, 110032, China.
  • Yuan Wu
    State Key Laboratory of Precision Spectroscopy, Quantum Institute for Light and Atoms, Department of Physics and Electronic Science, East China Normal University, Shanghai 200062, China.
  • Zi-Ying Gong
    Shanghai Yunying Medical Technology Co., Ltd., Shanghai, 201612, China.
  • Hai-Dan Ye
    Shanghai Yunying Medical Technology Co., Ltd., Shanghai, 201612, China.
  • Xiao-Kai Zhao
    Shanghai Yunying Medical Technology Co., Ltd., Shanghai, 201612, China.
  • Jie-Yi Li
    Shanghai Yunying Medical Technology Co., Ltd., Shanghai, 201612, China.
  • Xiao-Mei Zhang
    Jiangsu Cancer Hospital, Jiangsu Institute of Cancer Research, The Affiliated Cancer Hospital of Nanjing Medical University, Nanjing, 210009, China.
  • Sheng Li
    School of Data Science, University of Virginia, Charlottesville, VA, United States.
  • Wei Zhu
    The Second Clinical College of Guangzhou University of Chinese Medicine, Guangzhou University of Chinese Medicine, Guangzhou 510120, China. Electronic address: zhuwei9201@163.com.
  • Mei Wang
    Natural Products Utilization Research Unit, Agricultural Research Service, U.S. Department of Agriculture, Oxford, MS, 38677, USA.
  • Ge-Yu Liang
    School of Public Health, Southeast University, Nanjing, 211189, China.
  • Yun Liu
    Google Health, Palo Alto, CA, USA.
  • Xin Guan
    Guangzhou Xinhua University, Dongguan, China.
  • Dao-Yun Zhang
    Shanghai Yunying Medical Technology Co., Ltd., Shanghai, 201612, China.
  • Bo Shen
    School of Information Science and Technology, Donghua University, Shanghai 200051, China. Electronic address: Bo.Shen@dhu.edu.cn.