Identification of biomarkers for knee osteoarthritis through clinical data and machine learning models.

Journal: Scientific reports
PMID:

Abstract

Knee osteoarthritis (KOA) is a progressive degenerative disorder characterized by the gradual erosion of articular cartilage. This study aimed to develop and validate biomarker-based predictive models for KOA diagnosis using machine learning techniques. Clinical data from 2594 samples were obtained and split into training and validation datasets in a 7:3 ratio. Key clinical features were identified through differential analysis between KOA and control groups, combined with least absolute shrinkage and selection operator (LASSO) regression. The SHapley Additive exPlanations (SHAP) method was employed to rank feature importance quantitatively. Based on these rankings, predictive models were constructed using Logistic Regression (LR), Random Forest (RF), eXtreme Gradient Boosting (XGBoost), Naive Bayes (NB), Support Vector Machine (SVM), and Decision Tree (DT) algorithms. Models were developed for subsets of variables, including the top 5, top 10, top 15, and all identified features. Receiver operating characteristic (ROC) curves were used to compare diagnostic performance across models. Additionally, a risk stratification framework for KOA prediction was designed using recursive partitioning analysis (RPA). Differential analysis and LASSO identified 44 critical clinical features. Among these, age, plasma prothrombin time, gender, body mass index (BMI), and the prothrombin time international normalized ratio (PTINR) emerged as the top five features, with SHAP values of 0.1990, 0.0981, 0.0471, 0.0433, and 0.0422, respectively. Machine learning analysis demonstrated that these variables provided robust diagnostic performance for KOA. In the training set, area under the curve (AUC) values for the LR, RF, XGBoost, NB, SVM, and DT models were 0.947, 0.961, 0.892, 0.952, 0.885, and 0.779, respectively. Similarly, in the validation dataset, these models achieved AUC values of 0.961, 0.943, 0.789, 0.957, 0.824, and 0.76, respectively. Among them, RF consistently exhibited superior diagnostic accuracy for KOA. Additionally, the RPA indicated a higher prevalence of KOA among individuals aged 54 years and older. The integration of the top five clinical variables significantly enhanced the diagnostic accuracy for KOA, particularly when employing the RF model. Moreover, the RPA model offered valuable insights to assist clinicians in refining prognostic assessments and optimizing clinical decision-making.
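
The 7:3 split and the AUC-based model comparison described in the abstract can be illustrated with a minimal Python sketch that uses only the standard library. This is not the authors' code. The function names `stratified_split` and `roc_auc` are hypothetical, splitting within each class is an assumption (the abstract states only the 7:3 ratio), and the demonstration runs on synthetic scores rather than the study data.

```python
import random


def stratified_split(labels, train_frac=0.7, seed=0):
    """Return (train_idx, valid_idx), splitting each class separately.

    Assumption: the split is stratified by class label; the abstract
    only states that the data were divided in a 7:3 ratio.
    """
    rng = random.Random(seed)
    train_idx, valid_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(train_frac * len(idx))
        train_idx.extend(idx[:cut])
        valid_idx.extend(idx[cut:])
    return sorted(train_idx), sorted(valid_idx)


def roc_auc(labels, scores):
    """Area under the ROC curve for binary labels (1 = case, 0 = control).

    Computed as the fraction of (case, control) pairs in which the case
    receives the higher score; tied pairs count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Synthetic toy data, for illustration only (not the study cohort).
    rng = random.Random(1)
    labels = [1] * 50 + [0] * 50
    scores = [rng.gauss(1.0 if y == 1 else 0.0, 1.0) for y in labels]

    train_idx, valid_idx = stratified_split(labels)
    print("train size:", len(train_idx), "validation size:", len(valid_idx))
    print("validation AUC:",
          roc_auc([labels[i] for i in valid_idx],
                  [scores[i] for i in valid_idx]))
```

In a comparison like the one reported, `roc_auc` would be applied to each model's predicted probabilities on the same validation indices, so that the AUC values are computed on identical samples.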

Authors

  • Wei Chen
    Department of Urology, Zigong Fourth People's Hospital, Sichuan, China.
  • Haotian Zheng
    State Key Joint Laboratory of Environmental Simulation and Pollution Control, School of Environment, Tsinghua University, Beijing 100084, China.
  • Binglin Ye
    Department of Orthopaedics, Traditional Chinese Medical Hospital of Gansu Province, Qilihe District, Guazhou Street 418, Lanzhou, 730050, Gansu, China.
  • Tiefeng Guo
    Department of Orthopaedics, Traditional Chinese Medical Hospital of Gansu Province, Qilihe District, Guazhou Street 418, Lanzhou, 730050, Gansu, China.
  • Yude Xu
    Department of Orthopaedics, Traditional Chinese Medical Hospital of Gansu Province, Qilihe District, Guazhou Street 418, Lanzhou, 730050, Gansu, China.
  • Zhibin Fu
    Department of Urology, Changzheng Hospital, Naval Military Medical University, Shanghai, P.R. China.
  • Xing Ji
    Department of Orthopaedics, Traditional Chinese Medical Hospital of Gansu Province, Qilihe District, Guazhou Street 418, Lanzhou, 730050, Gansu, China.
  • Xiping Chai
    Department of Orthopaedics, Traditional Chinese Medical Hospital of Gansu Province, Qilihe District, Guazhou Street 418, Lanzhou, 730050, Gansu, China.
  • Shenghua Li
    Department of Orthopaedics, Traditional Chinese Medical Hospital of Gansu Province, Qilihe District, Guazhou Street 418, Lanzhou, 730050, Gansu, China. lish0619@163.com.
  • Qiang Deng
    Department of Orthopaedics, Traditional Chinese Medical Hospital of Gansu Province, Qilihe District, Guazhou Street 418, Lanzhou, 730050, Gansu, China. 1031518835@qq.com.