Privacy and Security Throughout the Health Data Life Cycle: From Primary Care to Research Networks.

Journal: Annual Review of Biomedical Data Science
Published Date:

Abstract

Health data are increasingly generated, shared, and analyzed across an ever-growing collection of settings. While these developments enable new forms of biomedical discovery and clinical decision support, they also introduce evolving privacy, security, and trust challenges that extend beyond traditional regulatory and technical frameworks. In this review, we characterize the risks and protections throughout the health data life cycle, from data generation and primary use in healthcare to secondary use in research and artificial intelligence (AI) model development. We examine how regulation, organizational practices, and technological choices shape data protection requirements, and we contextualize emerging threats, such as incidental disclosures through AI tools. We further review technical approaches for mitigating these risks, including access control and auditing, reidentification risk assessment and statistical mechanisms for risk mitigation (e.g., differential privacy), and synthetic data generation. We also consider how collaboration across disparate organizations may be achieved through federated learning mechanisms and cryptographic technologies, such as secure multiparty computation. Throughout, we highlight trade-offs between privacy protection and data utility, and we articulate practical challenges in deploying these methods at scale. We conclude by identifying open issues for the field, including the need for standardized metrics and greater transparency to support trust in data-driven healthcare and research.

Authors
