Machine learning-based analysis identifies and validates serum exosomal proteomic signatures for the diagnosis of colorectal cancer.

Journal: Cell reports. Medicine
Published Date:

Abstract

The potential of serum extracellular vesicles (EVs) as non-invasive biomarkers for diagnosing colorectal cancer (CRC) remains unclear. We employed an in-depth 4D-DIA proteomics and machine learning (ML) pipeline to identify two key proteins, PF4 and AACT, for CRC diagnosis in serum EV samples from a discovery cohort of 37 cases. When measured by ELISA in 912 individuals, PF4 and AACT outperformed the traditional biomarkers CEA and CA19-9. Furthermore, we developed an EV-related random forest (RF) model with the highest diagnostic efficiency, achieving AUC values of 0.960 and 0.963 in the training and test sets, respectively. Notably, this model demonstrated reliable diagnostic performance both for detecting early-stage CRC and for distinguishing CRC from benign colorectal diseases. Additionally, multi-omics approaches were employed to predict the functions and potential sources of serum EV-derived proteins. Collectively, our study identified crucial proteomic signatures in serum EVs and established a promising EV-related RF model for clinical CRC diagnosis.
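
For readers less familiar with the AUC metric quoted above, the following is a minimal sketch of how an area under the ROC curve can be computed from binary labels and model scores. It uses the pairwise rank identity (the probability that a randomly chosen case scores higher than a randomly chosen control). The function name `roc_auc` and the toy data are hypothetical and for illustration only; this is not the authors' code, and the numbers are not from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity.

    labels: sequence of 0/1 values (1 = case, 0 = control)
    scores: sequence of model outputs; higher means more likely a case
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    # Count case/control pairs in which the case scores higher; ties count half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (assumption, not study data): three cases, three controls.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)  # 8 of 9 pairs are ranked correctly
```

An AUC of 1.0 means every case is ranked above every control, while 0.5 corresponds to chance-level ranking. The quadratic pairwise loop is chosen here for clarity; for large cohorts a sort-based implementation would typically be preferred.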

Authors

  • Haofan Yin
    Digestive Diseases Center, The Seventh Affiliated Hospital of Sun Yat-Sen University, Shenzhen, Guangdong, China; Department of Laboratory Medicine, Shenzhen People's Hospital (The Second Clinical Medical College, Jinan University, The First Affiliated Hospital, Southern University of Science and Technology), Shenzhen, Guangdong, China.
  • Jinye Xie
    Department of Laboratory Medicine, Zhongshan City People's Hospital, Zhongshan, Guangdong, China.
  • Shan Xing
    US Medical, Takeda Pharmaceuticals USA Inc, Lexington, MA, USA.
  • Xiaofang Lu
    Tianjin Aoqun Sheep Industry Academy Limited, Tianjin, China.
  • Yu Yu
    Department of Breast Surgery, Shen Shan Medical Center, Memorial Hospital of Sun Yat-Sen University, Shanwei, Guangdong, China.
  • Yong Ren
    Artificial Intelligence Innovation Center, Research Institute of Tsinghua, Pearl River Delta, Guangzhou, China.
  • Jian Tao
    Visual Computing & Computational Media, College of Performance, Visualization & Fine Arts, Texas A&M University, College Station, Texas.
  • Guirong He
    Department of Laboratory Medicine, Shenzhen People's Hospital (The Second Clinical Medical College, Jinan University, The First Affiliated Hospital, Southern University of Science and Technology), Shenzhen, Guangdong, China.
  • Lijun Zhang
    Department of Paediatric Orthopaedics, Shengjing Hospital of China Medical University, Shenyang, Liaoning Province, China.
  • Xiaopeng Yuan
    Department of Laboratory Medicine, Shenzhen People's Hospital (The Second Clinical Medical College, Jinan University, The First Affiliated Hospital, Southern University of Science and Technology), Shenzhen, Guangdong, China. Electronic address: yuanxp2001@126.com.
  • Zheng Yang
    Sichuan University - Pittsburgh Institute (SCUPI), Sichuan University, Chengdu, 610207, China.
  • Zhijian Huang
    School of Computer Science and Engineering, Central South University, 410075, Changsha, China.