Leveraging Data Pipeline and LLM to Advance Patient Safety Event Studies.
Journal: Studies in Health Technology and Informatics
Published Date: May 15, 2025
Abstract
Studies that draw on the open-access MAUDE database often do not clearly describe how medical device report (MDR) data were extracted and processed, which reduces reproducibility and consistency. Building on the OpenFDA API and our MAUDE extract-transform-load (ETL) pipeline, which standardizes the extraction and transformation of MDR data, this project explores how a large language model (LLM) can be used to analyze free-text narratives in MDRs to improve the accuracy and efficiency of event categorization. The ETL-LLM approach is demonstrated on MDRs related to endoscopic mucosal resection and could be extended to other devices and patient issues. A larger and more diverse MDR sample is needed to improve data-driven patient safety research.
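The extract and transform steps described above might look roughly like the following minimal sketch. It is not the authors' pipeline. It assumes the public openFDA device event endpoint and its `search`, `limit`, and `skip` parameters, and the `mdr_text`, `report_number`, and `event_type` fields of an event record. The function names, the search term "snare", the date range, and the category labels are hypothetical placeholders chosen for illustration.

```python
from urllib.parse import urlencode

# Public openFDA endpoint for device adverse event reports (MAUDE data).
OPENFDA_DEVICE_EVENT_URL = "https://api.fda.gov/device/event.json"


def build_query_url(generic_name, date_from, date_to, limit=100, skip=0):
    """Extract step: build a documented, repeatable query URL.

    Dates are YYYYMMDD strings. Recording the exact URL is one way to
    make the extraction step reproducible.
    """
    search = (
        f'device.generic_name:"{generic_name}"'
        f" AND date_received:[{date_from} TO {date_to}]"
    )
    params = {"search": search, "limit": limit, "skip": skip}
    return OPENFDA_DEVICE_EVENT_URL + "?" + urlencode(params)


def flatten_record(record):
    """Transform step: reduce one raw event record to a flat row.

    Joins all free-text narrative entries into a single string.
    """
    narratives = [
        entry["text"].strip()
        for entry in record.get("mdr_text", [])
        if entry.get("text")
    ]
    return {
        "report_number": record.get("report_number"),
        "event_type": record.get("event_type"),
        "narrative": " ".join(narratives),
    }


def build_prompt(row, categories):
    """Prepare the text an LLM would receive for event categorization.

    The wording and the category list are illustrative assumptions.
    """
    options = ", ".join(categories)
    return (
        f"Classify the following medical device report into one of: {options}.\n"
        f"Report {row['report_number']}: {row['narrative']}"
    )


# Hypothetical record shaped like an openFDA device event result.
sample = {
    "report_number": "0000000-2024-00001",
    "event_type": "Injury",
    "mdr_text": [
        {"text_type_code": "Description of Event or Problem",
         "text": "Bleeding was noted after resection. "},
        {"text_type_code": "Additional Manufacturer Narrative",
         "text": "The device was not returned."},
    ],
}

url = build_query_url("snare", "20230101", "20231231", limit=5)
row = flatten_record(sample)
prompt = build_prompt(row, ["bleeding", "perforation", "device malfunction"])
```

The sketch makes no network or model call. Fetching `url` with an HTTP client and sending `prompt` to an LLM are left out because the abstract does not specify which client, model, or category scheme was used.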