Machine learning based screening of biomarkers associated with cell death and immunosuppression of multiple life stages sepsis populations.

Journal: Scientific reports
Published Date:

Abstract

Sepsis is a condition resulting from an uncontrolled immune response to infection, leading to widespread inflammatory damage and potentially fatal organ dysfunction. Specific prevention and treatment strategies for sepsis across different age groups are currently lacking. Programmed cell death (PCD) can regulate the enrichment of effector or regulatory immune cells, offering a new perspective for immunotherapy. Using computational biology and machine learning on global multicenter sepsis cohort data, this study aims to screen for specific biomarkers related to the immune microenvironment and PCD in populations at different life stages (neonates, children, and adults), providing foundational data for precision treatment and drug development in artificial intelligence-assisted sepsis diagnosis and treatment management. Gene expression data from sepsis patients in multicenter populations from China, Europe, and the United States were obtained from the Gene Expression Omnibus (GEO) database, and differentially expressed genes (DEGs) were identified. A literature review was conducted to obtain 18 PCD-related genes, which were intersected with the DEGs to identify DEGs associated with specific types of PCD. Nine machine learning algorithms (Logistic Regression, LR; Decision Tree, DT; Gradient Boosting Machine, GBM; K-Nearest Neighbors, KNN; Least Absolute Shrinkage and Selection Operator, LASSO; Principal Component Analysis, PCA; Random Forest, RF; Support Vector Machine, SVM; and XGBoost) were applied to training and testing datasets with 10-fold cross-validation to select the three best-performing models. The SHAP algorithm was then used to quantify the contribution of each cell death-related gene to the prediction of sepsis.
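The screening steps described above (intersecting DEGs with a PCD gene set, then comparing classifiers under 10-fold cross-validation and keeping the three best) could be sketched roughly as follows in Python with scikit-learn. This is an illustrative sketch only, not the authors' pipeline: the gene lists are hypothetical, the expression matrix is synthetic, only four of the nine algorithms are shown, and ranking by mean ROC AUC is an assumption.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Hypothetical inputs: a DEG list and one literature-derived PCD gene set.
degs = {"NLRC4", "AIM2", "GZMB", "GENE_X", "GENE_Y"}
pcd_gene_set = {"NLRC4", "AIM2", "GZMB", "CHMP7", "PRKACA"}

# DEGs associated with this PCD type = intersection of the two sets.
pcd_degs = sorted(degs & pcd_gene_set)

# Synthetic stand-in for an expression matrix (samples x PCD-related DEGs)
# with binary labels (1 = sepsis, 0 = control).
X, y = make_classification(
    n_samples=200,
    n_features=len(pcd_degs),
    n_informative=2,
    n_redundant=0,
    random_state=0,
)

# A subset of the algorithms named in the abstract, with default settings.
models = {
    "LR": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "RF": RandomForestClassifier(n_estimators=50, random_state=0),
    "GBM": GradientBoostingClassifier(random_state=0),
}

# Stratified 10-fold cross-validation, scored by ROC AUC (an assumed criterion).
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
mean_auc = {
    name: cross_val_score(model, X, y, cv=cv, scoring="roc_auc").mean()
    for name, model in models.items()
}

# Keep the three models with the highest mean cross-validated AUC.
top3 = sorted(mean_auc, key=mean_auc.get, reverse=True)[:3]

print("PCD-related DEGs:", pcd_degs)
for name in top3:
    print(f"{name}: mean AUC = {mean_auc[name]:.3f}")
```

In a real analysis the per-gene contributions of the selected models would then be examined with SHAP, which is omitted here.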
Key PCD patterns were identified based on model evaluation metrics (accuracy, precision, recall, F1 score, and the receiver operating characteristic (ROC) curve), and their associated DEGs were obtained through intersection, followed by immune-related analysis of the DEGs. The study included a total of 1507 sepsis cases and 484 controls globally: 90 neonatal cases and 95 controls, 527 pediatric cases and 101 controls, and 890 adult cases and 288 controls. The best model for predicting sepsis across the different populations was GBM. The key PCD patterns selected by machine learning for the different age groups were Pyroptosis (neonates), Ferroptosis (children), and Autophagy (adults). (1) In neonatal sepsis, the models constructed with the GBM, XGBoost, and RF algorithms performed best and identified 5 key DEGs associated with Pyroptosis (CHMP7, NLRC4, AIM2, GZMB, PRKACA). NLRC4 showed the best predictive ability (AUC = 0.902, P < 0.05) and was significantly positively correlated with neutrophils and negatively correlated with CD8+ T cells. (2) In the pediatric sepsis population, the models constructed with the GBM, SVM, and LASSO algorithms performed best. Six key DEGs associated with Ferroptosis were identified (AKR1C3, GCLM, PEBP1, CARS, MAP1LC3B, SLC11A2), among which MAP1LC3B, which plays a role in mitochondrial reactive oxygen species energy metabolism, showed the strongest predictive ability (AUC = 0.883, P < 0.05). It was significantly positively correlated with M0-type macrophages and significantly negatively correlated with activated CD4+ memory T cells. (3) In the adult sepsis population, the models constructed with the GBM, SVM, and LASSO algorithms performed best.
Three key DEGs associated with Autophagy were identified (TSPO, HTRA2, USP10). TSPO, which mediates oxidative stress regulation, iron homeostasis, and cholesterol transport, showed the strongest predictive ability (AUC = 0.825, P < 0.05). It was significantly positively correlated with M1-type macrophages and significantly negatively correlated with CD8+ T cells. Through the integrated application of computational biology and machine learning algorithms, this study identified biomarkers of PCD patterns that affect cytokine storm-mediated inflammation and immunosuppression in sepsis populations across different age groups (neonates, children, and adults). These findings have value for clinical application and drug development, and they provide a scientific basis for the global application of artificial intelligence-assisted sepsis diagnosis and treatment management.
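The single-gene AUC values reported above summarize how well one gene's expression separates sepsis cases from controls. As a minimal sketch of that calculation, the AUC can be computed as the probability that a randomly chosen case has higher expression than a randomly chosen control, with ties counted as one half. The expression values below are made up for illustration and are not taken from the study, and the function name is hypothetical.

```python
def single_gene_auc(case_values, control_values):
    """AUC of one gene as a classifier: P(case > control), ties count 0.5."""
    wins = 0.0
    for case in case_values:
        for control in control_values:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))


# Made-up expression values for one gene (arbitrary units).
cases = [7.9, 8.4, 6.8, 9.1, 7.2]
controls = [6.1, 6.9, 5.8, 7.0]

auc = single_gene_auc(cases, controls)
print(f"AUC = {auc:.3f}")  # 18 of 20 case/control pairs are ordered correctly: 0.900
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation, so higher-expressed marker genes in sepsis yield values closer to 1.0 under this definition.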

Authors

  • Jie Yang
    Key Laboratory of Development and Maternal and Child Diseases of Sichuan Province, Department of Pediatrics, Sichuan University, Chengdu, China.
  • Fanyan Ou
    Department of Clinical Pathology, The Second Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, China.
  • Binbin Li
    Department of Ophthalmology, Ganzhou People's Hospital, Ganzhou, China.
  • Lixiong Zeng
    Department of Clinical Pathology, The Second Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, China.
  • Qiuli Chen
    Department of Clinical Pathology, The Second Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, China.
  • Houyu Gan
    Department of Clinical Pathology, The Second Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, China.
  • Jianing Yu
    Department of Pulmonary and Critical Care Medicine, The Second Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, China.
  • Qian Guo
    State Key Laboratory for Turbulence and Complex Systems and Department of Biomedical Engineering, College of Engineering, Peking University, Beijing 100871, Beijing, China.
  • Jihua Feng
    Yunnan Minzu University, Kunming, China.
  • Jianfeng Zhang
    Department of Vascular Surgery, The First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi 530021, P.R. China.