Leveraging Data Pipeline and LLM to Advance Patient Safety Event Studies.

Journal: Studies in health technology and informatics
Published Date:

Abstract

Studies that use the open-access MAUDE database often describe their methods for extracting and processing medical device report (MDR) data unclearly, which reduces reproducibility and consistency. Building on the OpenFDA API and our MAUDE extract-transform-load (ETL) pipeline, which standardizes the extraction and transformation of MDR data, this project explores how a large language model (LLM) can be used to analyze the free-text narratives in MDRs to improve the accuracy and efficiency of event categorization. The ETL-LLM approach is demonstrated on MDRs related to endoscopic mucosal resection, with potential applications to other devices and patient issues. Further work is needed to expand the size and diversity of the MDR sample to improve data-driven patient safety research.
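
As an illustration only, the extract, transform, and LLM-prompt steps described in the abstract might be sketched as follows. This is not the authors' pipeline: the helper names (build_query_url, fetch_reports, transform, build_prompt), the search term, and the event categories are assumptions made for the example. The endpoint and the record fields (report_number, event_type, date_received, device, mdr_text) follow the public openFDA device adverse event API.

```python
import json
import urllib.parse
import urllib.request

# Public openFDA endpoint for device adverse event reports (MAUDE data).
BASE_URL = "https://api.fda.gov/device/event.json"


def build_query_url(search, limit=10):
    """Extract step: build an openFDA query URL for a search expression."""
    return BASE_URL + "?" + urllib.parse.urlencode({"search": search, "limit": limit})


def fetch_reports(url):
    """Download one page of results; requires network access."""
    with urllib.request.urlopen(url) as response:
        return json.load(response).get("results", [])


def transform(record):
    """Transform step: flatten one raw MDR into a fixed set of fields."""
    devices = record.get("device", [])
    first_device = devices[0] if devices else {}
    texts = [t["text"].strip() for t in record.get("mdr_text", []) if t.get("text")]
    return {
        "report_number": record.get("report_number"),
        "event_type": record.get("event_type"),
        "date_received": record.get("date_received"),
        "generic_name": first_device.get("generic_name"),
        "narrative": " ".join(texts),
    }


def build_prompt(row, categories):
    """LLM step: build a classification prompt (hypothetical wording)."""
    return (
        "Classify the following medical device report narrative into one of: "
        + ", ".join(categories)
        + ".\n\nNarrative: "
        + row["narrative"]
    )
```

A call such as fetch_reports(build_query_url('device.generic_name:"snare"')) would return raw reports that can be passed to transform one at a time; the search expression here is a placeholder and not the query used in the study. Keeping the transform step as a pure function of one record makes the output reproducible regardless of how the records were downloaded.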

Authors

  • Fagun Shah
    University of Texas at Dallas, Richardson, Texas, USA.
  • Yue Yu
    Department of Mathematics, Lehigh University, Bethlehem, PA, USA.
  • Yuheng Shi
    University of Texas Health Science Center at Houston, Houston, Texas, USA.
  • Yang Gong
    School of Biomedical Informatics, University of Texas Health Science Center, Houston, Texas, USA.