Using Machine Learning Models to Diagnose Chronic Rhinosinusitis: Analysis of Pre-Treatment Patient-Generated Health Data to Predict Cardinal Symptoms and Sinonasal Inflammation.
Journal:
American journal of rhinology & allergy
PMID:
40070054
Abstract
Background: The diagnosis of chronic rhinosinusitis (CRS) relies upon patient-reported symptoms and objective nasal endoscopy and/or computed tomography (CT) findings. Many patients, at the time of evaluation by an otolaryngologist or rhinologist, lack objective findings confirming CRS and do not have this disease.

Objective: We hypothesized that a machine learning model (MLM) could predict probable CRS using patient-reported data acquired prior to rhinologist-directed treatment. We leveraged patient-generated health data using a machine learning approach to predict: (1) the primary endpoint of sinonasal inflammation on CT evidenced by a Lund-Mackay score (LMS) ≥ 5 and (2) the secondary endpoint of LMS ≥ 5 and ≥ 2 cardinal symptoms of CRS.

Methods: 543 patients were evaluated at a tertiary care rhinology clinic and subsequently underwent CT imaging with LMS. Patient-reported outcome measures and additional patient data were collected via an electronic platform prior to in-person evaluation. Three MLMs (a random forest classifier, a deep neural network, and an extreme gradient boosting (XGBoost) algorithm) were trained on predictors drawn from patient-generated health data and tested on a naïve test set (90:10 training:test set split). Cross-validation was performed, and model performance was compared between algorithms and with linear regression techniques.

Results: 57 predictors were extracted from the patient-generated health data. The best model (XGBoost) achieved an area-under-the-curve (AUC) of 71.3% (accuracy 74.5%, sensitivity 38.9%, specificity 91.9%) in predicting the primary endpoint, and an AUC of 79.8% (accuracy 85.5%, sensitivity 36.4%, specificity 97.7%) in predicting the secondary endpoint. This exceeded the performance of a linear regression model.

Conclusion: An MLM using patient-generated health data accurately predicted patients with probable CRS (≥ 2 cardinal symptoms and LMS ≥ 5). With further validation on a larger cohort, such a tool could potentially be used by otolaryngologists to inform the clinical utility of diagnostic imaging and for screening prior to subspecialty rhinology referral.
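
For readers less familiar with how accuracy, sensitivity, and specificity relate to one another, the following is a minimal Python sketch that derives all three from a confusion matrix. It is not the authors' code. The abstract does not report confusion matrices or the exact test-set size, so the counts below are hypothetical: they were chosen only because they are consistent with a test set of about 55 patients (roughly 10% of 543) and reproduce the reported percentages after rounding. The names `ConfusionCounts` and `summarize` are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # endpoint present, model predicted present
    fn: int  # endpoint present, model predicted absent
    tn: int  # endpoint absent, model predicted absent
    fp: int  # endpoint absent, model predicted present


def summarize(c: ConfusionCounts) -> dict[str, float]:
    """Return sensitivity, specificity, accuracy, and prevalence as fractions."""
    positives = c.tp + c.fn
    negatives = c.tn + c.fp
    total = positives + negatives
    return {
        "sensitivity": c.tp / positives,
        "specificity": c.tn / negatives,
        "accuracy": (c.tp + c.tn) / total,
        "prevalence": positives / total,
    }


# ASSUMPTION: hypothetical counts, not taken from the paper. They are one
# set of integers that matches the reported percentages for 55 patients.
primary = ConfusionCounts(tp=7, fn=11, tn=34, fp=3)
secondary = ConfusionCounts(tp=4, fn=7, tn=43, fp=1)

for name, counts in (("primary", primary), ("secondary", secondary)):
    metrics = summarize(counts)
    print(name, {k: round(100 * v, 1) for k, v in metrics.items()})
```

Accuracy is the prevalence-weighted average of sensitivity and specificity, so when most test patients do not meet the endpoint, a model can reach high accuracy through high specificity while still missing most true cases. This is worth keeping in mind when reading the reported figures, where sensitivity is below 40% for both endpoints.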