Efficacy of automated machine learning models and feature engineering for diagnosis of equivocal appendicitis using clinical and computed tomography findings.

Journal: Scientific Reports
PMID:

Abstract

This study evaluates the diagnostic efficacy of automated machine learning (AutoML; AutoGluon) combined with automated feature engineering and selection (autoFE; autofeat) in adult patients with ambiguous computed tomography (CT) results for acute appendicitis (AA). Two models were assessed: one based on clinical manifestations alone and one integrating clinical manifestations with CT findings. Both were compared with conventional single machine learning (ML) models, such as logistic regression (LR), and with established scoring systems, such as the Adult Appendicitis Score (AAS), to address the gap in diagnostic approaches for uncertain AA cases. In this retrospective analysis, 303 adult patients with indeterminate CT findings were divided into appendicitis (n = 115) and non-appendicitis (n = 188) groups. AutoGluon and autofeat were used for AA prediction. The AutoGluon-clinical model relied solely on clinical data, whereas the AutoGluon-clinical-CT model included both clinical and CT data. The AutoGluon models were compared with the single ML models and the AAS on the test dataset using the area under the receiver operating characteristic curve (AUROC), accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and F1 score. The single ML models were LR, LASSO regression, ridge regression, support vector machine, decision tree, random forest, and extreme gradient boosting. Feature importance values were obtained with AutoGluon's "feature_importance" method. With clinical data only, the AutoGluon-clinical model achieved an AUROC of 0.785 (95% CI 0.691-0.890) and the ridge regression model an AUROC of 0.755 (95% CI 0.649-0.861). The AutoGluon-clinical-CT model (AUROC 0.886, 95% CI 0.820-0.951) performed better than the ridge regression model using clinical and CT data (AUROC 0.852, 95% CI 0.774-0.930; p = 0.029). A new engineered feature, exp(-(duration from pain to CT) + rebound tenderness), was identified (importance = 0.049, p = 0.001).
AutoML (AutoGluon) and autoFE (autofeat) enhanced the diagnosis of uncertain AA cases, particularly when combining CT and clinical findings. This study suggests the potential of integrating AutoML and autoFE in clinical settings to improve diagnostic strategies and patient outcomes and make more efficient use of healthcare resources. Moreover, this research supports further exploration of machine learning in diagnostic processes.
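
As an illustration only, the engineered feature and the threshold-based test metrics named in the abstract can be sketched in plain Python. This is a minimal sketch, not the authors' code: the function and parameter names are hypothetical, rebound tenderness is assumed to be coded 0/1, the time unit of the pain-to-CT duration is not stated in the abstract and is assumed to be hours, and the example labels are made up rather than taken from the study.

```python
import math


def engineered_feature(pain_to_ct: float, rebound_tenderness: int) -> float:
    """exp(-(duration from pain to CT) + rebound tenderness).

    Assumptions: pain_to_ct is a duration in hours (unit not given in the
    abstract) and rebound_tenderness is coded 0 (absent) or 1 (present).
    """
    return math.exp(-pain_to_ct + rebound_tenderness)


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def diagnostic_metrics(y_true, y_pred) -> dict:
    """Compute confusion-matrix metrics for binary labels (1 = appendicitis)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": _ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
        "f1": _ratio(2 * tp, 2 * tp + fp + fn),
    }


if __name__ == "__main__":
    # Made-up labels for illustration; not data from the study.
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
    print(diagnostic_metrics(y_true, y_pred))
    print(engineered_feature(pain_to_ct=1.0, rebound_tenderness=1))
```

Under these assumptions the feature decays exponentially as the interval between pain onset and CT grows, and it is scaled up by a factor of e when rebound tenderness is present. AUROC is omitted from the sketch because it depends on predicted probabilities rather than thresholded labels.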

Authors

  • Juho An
    Department of Emergency Medicine, Ajou University School of Medicine, World Cup-ro, Suwon, Gyeonggi-do, 16499, South Korea.
  • Il Seok Kim
    Department of Anesthesiology and Pain Medicine, Kangdong Sacred Hospital, Hallym University College of Medicine, Seongan-ro, Seoul, 05355, South Korea.
  • Kwang-Ju Kim
    Electronics and Telecommunications Research Institute (ETRI), Techno sunhwan-ro, Daegu, 42994, South Korea.
  • Ji Hyun Park
  • Hyuncheol Kang
    Department of Big Data and AI, Hoseo University, Hoseo-ro, Asan, Chungcheongnam-do, 31499, South Korea.
  • Hyuk Jung Kim
    Department of Radiology, Daejin Medical Center, Bundang Jesaeng General Hospital, Seohyeon-ro, Seongnam, Gyeonggi-do, 13590, South Korea.
  • Young Sik Kim
    Department of Emergency Medicine, Daejin Medical Center, Bundang Jesaeng General Hospital, Seohyeon-ro, Seongnam, Gyeonggi-do, 13590, South Korea.
  • Jung Hwan Ahn
    Department of Emergency Medicine, Ajou University School of Medicine, World Cup-ro, Suwon, Gyeonggi-do, 16499, South Korea. erdrajh@naver.com.