Identification of patients' smoking status using an explainable AI approach: a Danish electronic health records case study.

Journal: BMC Medical Research Methodology
PMID:

Abstract

BACKGROUND: Smoking is a critical risk factor responsible for over eight million deaths worldwide each year. Information on smoking habits is essential for advancing research and implementing preventive measures such as screening of high-risk individuals. In most countries, including Denmark, smoking habits are not systematically recorded and are at best documented in unstructured free-text sections of electronic health records (EHRs). Extracting this information requires researchers and clinicians to manually navigate large amounts of unstructured data, which is one of the main reasons smoking habits are rarely integrated into larger studies. Our aim is to develop machine learning models to classify patients' smoking status from their EHRs.

Authors

  • Ali Ebrahimi
    Laboratory for Computational Sensing and Robotics, Johns Hopkins University, Baltimore, MD.
  • Margrethe Bang Høstgaard Henriksen
    Department of Oncology, Lillebaelt Hospital, University Hospital of Southern Denmark, Vejle, 7100, Denmark.
  • Claus Lohman Brasen
    Department of Biochemistry and Immunology, Lillebaelt Hospital, Vejle, Denmark.
  • Ole Hilberg
    Institute of Regional Health Research, University of Southern Denmark, Odense, Denmark.
  • Torben Frøstrup Hansen
    Department of Oncology, Lillebaelt Hospital, University Hospital of Southern Denmark, Vejle, 7100, Denmark.
  • Lars Henrik Jensen
    Department of Oncology, Lillebaelt Hospital, University Hospital of Southern Denmark, Vejle, 7100, Denmark.
  • Abdolrahman Peimankar
    SDU Health Informatics and Technology, The Maersk Mc-Kinney Moller Institute, University of Southern Denmark, Odense 5230, Denmark. Electronic address: abpe@mmmi.sdu.dk.
  • Uffe Kock Wiil
    Maersk Mc-Kinney Moller Institute, University of Southern Denmark, Denmark.