Analysis of diagnostic genes and molecular mechanisms of Crohn's disease and colon cancer based on machine learning algorithms.

Journal: Scientific reports
PMID:

Abstract

Crohn's disease (CD) is a chronic inflammatory bowel condition, and colon adenocarcinoma (COAD), one of the most prevalent malignant tumors of the digestive tract, has been reported to be closely associated with it. This study uses bioinformatics techniques to uncover potential molecular links between CD and COAD. Two CD-related data series were selected from the Gene Expression Omnibus (GEO) database according to specific criteria, and relevant COAD gene data were obtained from The Cancer Genome Atlas (TCGA). Weighted Gene Co-expression Network Analysis (WGCNA), analysis of differentially expressed genes (DEGs), and protein-protein interaction (PPI) network analysis were conducted. A diagnostic model was established using machine learning, and its diagnostic accuracy was validated using methods such as Receiver Operating Characteristic (ROC) curves and nomograms. Gene Set Enrichment Analysis (GSEA) was used to identify the relevant pathways and biological processes. Machine learning selection identified three genes: DPEP1, MMP3, and MMP13. The ROC curves showed that the model built on these three genes has a high level of accuracy, supporting their potential as biomarkers. GSEA further indicated that the pathways associated with these three key genes are closely related to cytokines and other factors. In summary, this study identified DPEP1, MMP3, and MMP13 as key biomarker genes for CD and COAD, providing additional molecular links between the two diseases and further connections and pathways for reference regarding the progression of CD to COAD.
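
The ROC-based validation mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic, the scores stand in for a hypothetical model output (for example, a composite of DPEP1, MMP3, and MMP13 expression), and the function names roc_auc and roc_points are invented for illustration. It shows only how an ROC curve and its area under the curve (AUC) are computed from labels and scores.

```python
import random


def roc_auc(labels, scores):
    """AUC as the probability that a random positive sample scores higher
    than a random negative sample (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def roc_points(labels, scores):
    """(false positive rate, true positive rate) pairs obtained by sweeping
    the decision threshold from the highest score to the lowest."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


# Synthetic example (assumption): 30 "disease" and 30 "control" samples whose
# hypothetical model scores are drawn from two overlapping normal distributions.
rng = random.Random(0)
labels = [1] * 30 + [0] * 30
scores = [rng.gauss(1.0, 1.0) for _ in range(30)] + \
         [rng.gauss(0.0, 1.0) for _ in range(30)]

auc = roc_auc(labels, scores)
curve = roc_points(labels, scores)
print(f"AUC on synthetic data: {auc:.3f}")
```

An AUC near 1.0 indicates that the score separates the two groups well, while 0.5 corresponds to chance. In practice a library routine would normally replace the hand-written functions; they are spelled out here only to make the calculation explicit.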

Authors

  • Jie Xiao
    Department of Pharmacy, The Second Xiangya Hospital, Central South University, Changsha, China; Institute of Clinical Pharmacy, The Second Xiangya Hospital, Central South University, Changsha, China.
  • Junyao Liang
    First Affiliated Hospital of Hunan University of Traditional Chinese Medicine, Changsha, 410007, Hunan, China.
  • Tao Zhou
    Department of Otorhinolaryngology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China.
  • Man Zhou
    Department of Biomedical Engineering, College of Engineering, and Centre for Quantitative Biology, Peking University, Beijing, China.
  • Dexu Zhang
    Shandong Provincial Qianfoshan Hospital, Shandong University, Shandong, China.
  • Hui Feng
    School of Biomedical Engineering and Guangdong Provincial Key Laboratory of Medical Image Processing, Southern Medical University, Guangzhou, 510515, Guangdong, China.
  • Chusen Tang
    First Affiliated Hospital of Hunan University of Traditional Chinese Medicine, Changsha, 410007, Hunan, China.
  • Qian Zhou
    Department of Computer Science, City University of Hong Kong, Hong Kong, Hong Kong SAR, China.
  • Weiqing Yang
  • Xiaoqin Tan
    Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, 201203, China.
  • Wanjia Zhang
    First Affiliated Hospital of Hunan University of Traditional Chinese Medicine, Changsha, 410007, Hunan, China.
  • Yin Xu
    First Affiliated Hospital of Hunan University of Traditional Chinese Medicine, Changsha, 410007, Hunan, China. 311118@hnucm.edu.cn.