Privacy-Preserving Model Training for Disease Prediction Using Federated Learning with Differential Privacy

Journal: Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC)

Abstract

Machine learning plays an increasingly critical role in health science through its capability to infer valuable information from high-dimensional data. More training data provides greater statistical power to build better models that can support decision-making in healthcare. However, this often requires combining research and patient data across institutions and hospitals, which is not always possible due to privacy considerations. In this paper, we outline a simple federated learning algorithm that implements differential privacy to preserve privacy when training a machine learning model on data spread across different institutions. We tested our approach by predicting breast cancer status from gene expression data. With privacy enforced, our model achieves accuracy and precision similar to those of a non-private neural network trained at a single site. This result suggests that our algorithm is an effective method of combining differential privacy with federated learning, and clinical data scientists can use our general framework to produce differentially private models on federated datasets. Our framework is available at https://github.com/gersteinlab/idash20FL.
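The abstract does not spell out the algorithm itself. As an illustration only, a common way to combine federated learning with differential privacy is for each site to compute a local model update, clip its L2 norm, and add Gaussian noise before the server averages the updates (the Gaussian mechanism applied to federated averaging). The sketch below uses logistic regression on synthetic data; all function names, the learning rate, the clipping bound, and the noise scale are illustrative assumptions, not the authors' implementation, and the noise scale would in practice be calibrated to a target (epsilon, delta) privacy budget.

```python
import numpy as np

rng = np.random.default_rng(0)

def local_update(w, X, y, lr=0.1):
    # One gradient-descent step of logistic regression on a site's local data;
    # returns the proposed change to the shared weights (not the weights themselves).
    p = 1.0 / (1.0 + np.exp(-X @ w))
    grad = X.T @ (p - y) / len(y)
    return -lr * grad

def privatize(update, clip=1.0, sigma=0.1):
    # Gaussian mechanism: bound each site's influence by clipping the update's
    # L2 norm to `clip`, then add Gaussian noise scaled to that bound.
    # `sigma` is illustrative; a real deployment calibrates it to (epsilon, delta).
    norm = np.linalg.norm(update)
    update = update * min(1.0, clip / max(norm, 1e-12))
    return update + rng.normal(0.0, sigma * clip, size=update.shape)

def federated_round(w, sites):
    # Each site privatizes its own update locally; the server only sees
    # the noisy, clipped updates and averages them into the global model.
    updates = [privatize(local_update(w, X, y)) for X, y in sites]
    return w + np.mean(updates, axis=0)

def make_site(n, d=5):
    # Toy stand-in for one institution's data, sharing a common linear signal.
    X = rng.normal(size=(n, d))
    y = (X @ np.ones(d) > 0).astype(float)
    return X, y

sites = [make_site(200), make_site(200)]  # two simulated institutions
w = np.zeros(5)
for _ in range(50):
    w = federated_round(w, sites)
```

Because clipping and noising happen at each site before anything is shared, no raw patient data or exact gradients ever leave an institution, which is the property the paper's framework is built around.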

Authors

  • Amol Khanna
  • Vincent Schaffer
  • Gamze Gürsoy
    Department of Biomedical Informatics, Department of Computer Science, Columbia University, New York Genome Center, New York, NY, USA.
  • Mark Gerstein
    Program of Computational Biology and Bioinformatics and Department of Molecular Biophysics and Biochemistry and Department of Computer Science, Yale University, New Haven, CT 06511, USA.