Synthetic data in medicine: Legal and ethical considerations for patient profiling.

Journal: Computational and Structural Biotechnology Journal
Published Date:

Abstract

Synthetic data is increasingly used in healthcare to facilitate privacy-preserving research, algorithm training, and patient profiling. By mimicking the statistical properties of real data without exposing identifiable information, synthetic data promises to resolve tensions between innovation and data protection. However, its legal and ethical implications remain insufficiently examined, particularly within the European Union (EU) regulatory landscape. This paper contributes to the emerging field of synthetic data governance by proposing a differentiated legal-ethical framework tailored to EU law. It follows a three-part taxonomy of synthetic data (fully synthetic, partially synthetic, and hybrid) based on generation methods and identifiability risk. The taxonomy is situated within the broader context of the General Data Protection Regulation, the Artificial Intelligence Act, and the Medical Devices Regulation, clarifying when and how synthetic data may fall within the scope of EU regulation. Focusing on patient profiling as a high-risk use case, the paper shows that while fully synthetic data may not constitute personal data, its downstream application in clinical or decision-making systems can still raise fairness, bias, and accountability concerns. The ethical analysis of profiling practices that use synthetic data is conducted through the lens of the four foundational principles of biomedical ethics: autonomy, beneficence, non-maleficence, and justice. The paper calls for sector-specific standards, generation quality benchmarks, and governance mechanisms that align technical innovation with legal compliance and ethical integrity in digital health.

Authors

  • Maja Nisevic
    CiTiP KUL, Belgium.
  • Dusko Milojevic
    CiTiP KUL, Belgium.
  • Daniela Spajic
    CiTiP KUL, Belgium.
