Domain Knowledge-Driven Generation of Synthetic Healthcare Data.

Journal: Studies in health technology and informatics
Published Date:

Abstract

Longitudinal healthcare data collected across patients' life cycles now offer many opportunities for transforming healthcare with artificial intelligence algorithms. However, access to "real" healthcare data is a major challenge for ethical and legal reasons. Electronic health records (EHRs) also pose challenges of their own, including bias, heterogeneity, imbalanced data, and small sample sizes. In this study, we introduce a domain knowledge-driven framework for generating synthetic EHRs, as an alternative to methods that use only EHR data or expert knowledge. By leveraging external medical knowledge sources in the training algorithm, the proposed framework is designed to maintain data utility, fidelity, and clinical validity while preserving patient privacy.

Authors

  • Atiye Sadat Hashemi
    Center for Applied Intelligent Systems Research in Health, Halmstad University, Sweden.
  • Amira Soliman
    Center for Applied Intelligent Systems Research (CAISR), Halmstad University, Halmstad, Sweden.
  • Jens Lundström
    Center for Applied Intelligent Systems Research in Health, Halmstad University, Sweden.
  • Kobra Etminani
    Department of Medical Informatics, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran. Electronic address: EtminaniK@mums.ac.ir.