A machine learning tool for early identification of celiac disease autoimmunity.

Journal: Scientific Reports
PMID:

Abstract

Identifying which patients should undergo serologic screening for celiac disease (CD) may help diagnose patients who would otherwise experience diagnostic delays or remain undiagnosed. Using anonymized outpatient data from the electronic medical records of Maccabi Healthcare Services, we developed and evaluated five machine learning models to classify patients as at risk for CD autoimmunity before their first documented diagnosis or positive serum tissue transglutaminase IgA (tTG-IgA) result. A training set of highly seropositive cases (tTG-IgA > 10× the upper limit of normal [ULN]; n = 677) with likely CD and controls (n = 176,293) with no evidence of CD autoimmunity was used for model development. Input features included demographic information and commonly available laboratory results. The models were then evaluated for discriminative ability, measured by the area under the receiver operating characteristic curve (AUC), on a distinct set of highly seropositive cases (n = 153) and controls (n = 41,087). The highest-performing model was XGBoost (AUC = 0.86), followed by logistic regression (AUC = 0.85), random forest (AUC = 0.83), multilayer perceptron (AUC = 0.80), and decision tree (AUC = 0.77). Features contributing to the XGBoost model's classification of a patient as at risk for undiagnosed CD autoimmunity included signs of anemia, transaminitis, and decreased high-density lipoprotein. This model's ability to distinguish cases of incident CD autoimmunity from controls shows promise as a potential clinical tool for identifying patients in the community who are at increased risk of undiagnosed celiac disease and who may benefit from serologic screening.
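
As an illustration of the evaluation metric named in the abstract, the following minimal sketch computes AUC as the probability that a randomly chosen case receives a higher risk score than a randomly chosen control, with ties counted as one half. It is not the authors' code: the function name `roc_auc` and the toy labels and scores are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def roc_auc(labels, scores):
    """AUC via the rank-sum (Mann-Whitney U) formulation.

    labels: 1 for a case, 0 for a control (hypothetical toy encoding).
    scores: model risk scores, higher meaning more likely a case.
    """
    pairs = sorted(zip(scores, labels))
    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = len(pairs) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one case and one control")

    # Sum the 1-based ranks of the cases, giving tied scores their average rank.
    rank_sum_pos = 0.0
    i = 0
    while i < len(pairs):
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        rank_sum_pos += avg_rank * sum(1 for k in range(i, j) if pairs[k][1] == 1)
        i = j

    # U counts case/control pairs in which the case is ranked higher.
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)


# Toy data, invented for illustration only (not from the study).
toy_labels = [1, 1, 0, 0, 0, 1]
toy_scores = [0.9, 0.4, 0.35, 0.8, 0.1, 0.7]
print(round(roc_auc(toy_labels, toy_scores), 3))  # 7 of 9 case/control pairs ordered correctly -> 0.778
```

An AUC of 0.5 corresponds to scores that carry no information about case status, and 1.0 to perfect separation of cases from controls. Because it depends only on how cases rank relative to controls, the metric does not change with the case-to-control ratio, which is relevant when cases are as rare as in the sets described above.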

Authors

  • Michael Dreyfuss
    Predicta Med Analytics Ltd., Ramat Gan, Israel. michael@predicta-med.com.
  • Benjamin Getz
    Predicta Med Analytics Ltd., Ramat Gan, Israel.
  • Benjamin Lebwohl
    Celiac Disease Center, Department of Medicine, Columbia University Irving Medical Center, New York, NY, USA.
  • Or Ramni
    Predicta Med Analytics Ltd., Ramat Gan, Israel.
  • Daniel Underberger
    Predicta Med Analytics Ltd., Ramat Gan, Israel.
  • Tahel Ilan Ber
    Predicta Med Analytics Ltd., Ramat Gan, Israel.
  • Shlomit Steinberg-Koch
    Predicta Med Analytics Ltd., Ramat Gan, Israel.
  • Yonatan Jenudi
    Translational Oncology Laboratory, The Hematology Institute and Blood Bank, Meir Medical Center, Tchernichovsky 59, 6997801, Kfar Saba, Israel.
  • Sivan Gazit
    MaccabiTech, Maccabi Healthcare Services, Tel Aviv, Israel.
  • Tal Patalon
    Kahn Sagol Maccabi Research & Innovation Center, Maccabi Healthcare Services, Tel Aviv, Israel.
  • Gabriel Chodick
    Maccabi Healthcare Services, Tel Aviv, Israel.
  • Yehuda Shoenfeld
    Zabludowicz Center for Autoimmune Diseases, Sheba Medical Center, Sackler Faculty of Medicine, Tel-Aviv University, Tel-Aviv, Israel.
  • Amir Ben-Tov
    Kahn Sagol Maccabi Research & Innovation Center, Maccabi Healthcare Services, Tel Aviv, Israel.