Development of a Natural Language Processing (NLP) model to automatically extract clinical data from electronic health records: results from an Italian comprehensive stroke center.

Journal: International Journal of Medical Informatics
Published Date:

Abstract

INTRODUCTION: Clinical data collection often relies on time-consuming manual entry, even though a vast amount of the relevant information is embedded in unstructured text such as patients' medical records and clinical notes. Our study aims to develop a pipeline that combines active learning (AL) and NLP techniques to improve data extraction in an acute ischemic stroke cohort.

Authors

  • Davide Badalotti
    Department of Computing Sciences, Bocconi University, Milano, Italy; Artificial Intelligence Center, Humanitas Clinical and Research Center - IRCCS, Via A. Manzoni 56, Rozzano 20089, Milan, Italy. Electronic address: d.badalotti@campus.unimib.it.
  • Akanksha Agrawal
    Department of Biomedical Sciences, Humanitas University, via Rita Levi Montalcini 4, 20072 Pieve Emanuele, Milan, Italy.
  • Umberto Pensato
    Department of Biomedical Sciences, Humanitas University, via Rita Levi Montalcini 4, 20072 Pieve Emanuele, Milan, Italy; IRCCS Humanitas Research Hospital, via Manzoni 56, 20089 Rozzano, Milan, Italy.
  • Giovanni Angelotti
    Artificial Intelligence Center, IRCCS Humanitas Research Hospital, Via Manzoni 56, Rozzano, Milan 20089, Italy.
  • Simona Marcheselli
    Stroke Unit, IRCCS Humanitas Research Hospital, 20089 Milan, Italy.