Accurate prediction of colorectal cancer diagnosis using machine learning based on immunohistochemistry pathological images.

Journal: Scientific Reports
Published Date:

Abstract

Colorectal cancer (CRC) is the third most prevalent cancer and the second leading cause of cancer-related mortality. Early and accurate diagnosis is important for improving patient treatment and prognosis. Machine learning and bioinformatics have provided novel approaches for cancer diagnosis. This study aims to develop a CRC diagnostic model based on immunohistochemical staining image features using machine learning methods. First, CRC-specific genes were identified from RNA-seq data in the GEO and TCGA databases through bioinformatics analysis, support vector machine recursive feature elimination (SVM-RFE), and the Random Forest algorithm. These genes were then verified using proteomics data from the CPTAC and HPA databases, yielding four target proteins (AKR1B10, CA2, DHRS9, and ZG16) for further investigation. A support vector machine (SVM) and a convolutional neural network (CNN) were then used to analyze and integrate features of the immunohistochemical images and construct a CRC diagnostic model. Cross-validation and external validation were applied during training and validation to assess the model's accuracy and reliability. The resulting diagnostic model performed very well in distinguishing CRC from normal controls (accuracy: 0.999), indicating potential for clinical application. These findings are expected to provide new perspectives and methods for the personalized diagnosis of CRC and a more precise reference for treatment.
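
The gene-selection and validation steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline. It assumes scikit-learn, uses a synthetic matrix in place of the GEO/TCGA expression data, and the gene labels, the cutoff N_KEEP, and all hyperparameters are hypothetical. It shows one plausible reading of the approach: rank features with SVM-RFE and with Random Forest importance, keep the features both methods select, and estimate accuracy by stratified cross-validation.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.svm import SVC

# Synthetic stand-in for a (samples x genes) expression matrix; NOT the study's data.
X, y = make_classification(
    n_samples=200, n_features=50, n_informative=5, n_redundant=0,
    n_clusters_per_class=1, class_sep=2.0, shuffle=False, random_state=0,
)
gene_names = np.array([f"gene_{i}" for i in range(X.shape[1])])  # hypothetical labels

N_KEEP = 10  # assumed cutoff; the abstract does not state one

# SVM-RFE: repeatedly fit a linear SVM and drop the lowest-weight feature.
svm_rfe = RFE(SVC(kernel="linear", C=1.0), n_features_to_select=N_KEEP, step=1)
svm_rfe.fit(X, y)
svm_genes = set(gene_names[svm_rfe.support_])

# Random Forest: keep the N_KEEP features with the highest impurity-based importance.
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
top_rf = np.argsort(rf.feature_importances_)[::-1][:N_KEEP]
rf_genes = set(gene_names[top_rf])

# Candidate genes are those selected by both methods.
consensus = sorted(svm_genes & rf_genes)

# 5-fold stratified cross-validation of a linear SVM on the consensus features.
cols = [i for i, g in enumerate(gene_names) if g in consensus]
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(SVC(kernel="linear"), X[:, cols], y, cv=cv, scoring="accuracy")
mean_accuracy = float(scores.mean())

print("consensus features:", consensus)
print("mean CV accuracy:", round(mean_accuracy, 3))
```

Selecting features on all samples before cross-validating, as this sketch does for brevity, tends to give optimistic accuracy estimates. A stricter version would repeat the selection inside each fold (for example with a scikit-learn Pipeline) and confirm the result on an external cohort, in line with the external validation the abstract describes.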

Authors

  • Bobin Ning
    Department of General Surgery, the First Medical Centre, Chinese PLA General Hospital, Beijing, 100853, China.
  • Jimei Chi
    Key Laboratory of Green Printing, CAS Research/Education Center for Excellence in Molecular Sciences, Institute of Chemistry, Chinese Academy of Sciences (ICCAS), Beijing Engineering Research Center of Nanomaterials for Green Printing Technology, Beijing National Laboratory for Molecular Sciences (BNLMS), Beijing 100190, P. R. China.
  • Qingyu Meng
    Department of Health Statistics, Shanxi Medical University, Taiyuan, Shanxi, China.
  • Baoqing Jia
    Department of General Surgery II, the First Medical Center of Chinese PLA General Hospital, Fuxing Road, Haidian District, Beijing, China. baoqingjia@126.com.