Probability Score for the Diagnosis of Periprosthetic Joint Infection: Development and Validation of a Practical Multi-analyte Machine Learning Model.
Journal:
Cureus
Published Date:
May 1, 2025
Abstract
Background and objective: The diagnosis of periprosthetic joint infection (PJI) relies on criteria-based systems that require clinicians to interpret multiple laboratory tests and combine them into a score. Existing literature indicates that these criteria are adopted and implemented suboptimally in routine clinical practice, even among experts. To support accurate PJI diagnosis through proper synthesis of multiple laboratory parameters, this study aimed to develop and validate a machine learning (ML) model that generates a preoperative PJI probability score within 24 hours based solely on synovial fluid (SF) biomarkers.
Materials and methods: A two-stage ML model was constructed using 104,090 SF samples from 2,923 institutions (2018-2024). First, unsupervised learning identified natural clusters in the data to label samples as "infected" or "not infected." These labels were then used to train a supervised logistic regression model that generated PJI scores (0-100), categorizing cases as PJI positive (> 80), PJI negative (< 20), or equivocal (20-80). The model incorporated 10 SF biomarkers: specimen integrity markers (absorbance at 280 nm, red blood cell count), inflammatory markers (white blood cell count, percentage of neutrophils, SF C-reactive protein), a PJI-specific biomarker (alpha-defensin), and four microbial antigen markers. Culture results were excluded to allow for a 24-hour diagnosis. After the data were split into training (n = 83,272) and validation (n = 20,818) cohorts, performance was assessed against modified 2018 International Consensus Meeting criteria, including an evaluation in which "inconclusive" cases were probabilistically reclassified.
Results: The ML model and resulting PJI score showed high diagnostic accuracy in the validation cohort. Against the clinical reference, the PJI score achieved 99.3% sensitivity and 99.5% specificity before reclassification of inconclusive cases, and 98.1% sensitivity and 97.6% specificity after probabilistic reclassification. With a disease prevalence of 20.7%, the positive predictive value reached 91.5% and the negative predictive value 99.5%. The model resolved 95% (1,363/1,442) of samples deemed inconclusive by the clinical standard. Alpha-defensin, percentage of neutrophils, and white blood cell count were the most influential model features. The model performed well in culture-negative infections.
Conclusions: The ML model and resulting PJI score demonstrated exceptional diagnostic accuracy by leveraging unsupervised clustering of SF biomarker patterns. The model substantially reduced diagnostic uncertainty by definitively classifying most inconclusive cases, revealing their natural alignment with infected or non-infected patterns. This performance was achieved without SF culture results, so definitive diagnostic information is available within 24 hours based solely on biomarkers. Clinically, these findings indicate that an ML algorithm can match the diagnostic accuracy of complex clinical standards while transferring analytical complexity from clinicians to laboratories, minimizing the implementation gap that hinders current criteria-based approaches.
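The score thresholds and the relationship between sensitivity, specificity, prevalence, and predictive values described in the abstract can be illustrated with a short sketch. This is not the authors' code: the function names `categorize_score` and `predictive_values` are hypothetical, the thresholds are taken directly from the abstract, and the predictive values are computed with the standard Bayes formula. It is an assumption here that the reported predictive values correspond to the post-reclassification sensitivity and specificity; with those inputs the formula gives roughly 91.4% and 99.5%, close to the reported 91.5% and 99.5%, with the small gap plausibly due to rounding of the published figures.

```python
def categorize_score(score: float) -> str:
    """Map a 0-100 PJI score to the categories stated in the abstract."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score > 80:
        return "PJI positive"
    if score < 20:
        return "PJI negative"
    return "equivocal"  # scores from 20 through 80 inclusive


def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (PPV, NPV) from sensitivity, specificity, and prevalence.

    Standard Bayes relationship; all inputs are fractions in [0, 1].
    """
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


if __name__ == "__main__":
    # Assumption: post-reclassification figures from the abstract.
    ppv, npv = predictive_values(sensitivity=0.981, specificity=0.976, prevalence=0.207)
    print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}")
    for s in (5, 20, 50, 80, 95):
        print(s, categorize_score(s))
```

Because predictive values depend on prevalence, the same sensitivity and specificity would yield a different positive predictive value in a population where PJI is rarer or more common than the 20.7% reported here.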