False-positive tolerant model misconduct mitigation in distributed federated learning on electronic health record data across clinical institutions.

Journal: Scientific Reports

Abstract

As collaborative Machine Learning on cross-institutional, fully distributed networks becomes an important tool in predictive health modeling, its inherent security risks must be addressed. One such risk is the lack of a strategy to mitigate injections of tampered models (or "model misconduct") into the collaborative pipeline without raising excessive false alarms. We propose a false-positive tolerant methodology to preserve model integrity and mitigate the impact of adversarial misconduct in a Federated Learning scenario. Our method grants leeway for false-positive detections: each participating site is quarantined only when its misbehavior allowance (or "budget") is exhausted, preventing over-ostracization and the resulting loss of sample size. We evaluated our system on a decentralized blockchain network with three datasets. Our method demonstrated gains of 0.058-0.121 in Area Under the Receiver Operating Characteristic Curve compared to a solution without false-positive tolerance, with negligible overhead (< 12 milliseconds). Our method can serve as an efficient and robust approach to safeguarding the distributed Federated Learning process in collaborative healthcare, inheriting blockchain's transparency and decentralization. Future work can apply our methodology to more sophisticated machine learning algorithms or scale the experiments to larger learning scenarios with more participating sites and larger sample sizes.
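The abstract does not specify how the misbehavior budget is implemented; the following is a minimal Python sketch of one plausible reading, in which each site holds a fixed allowance of tolerated misconduct flags and is quarantined only once that allowance is exhausted. All names here (MisconductBudget, record_flag, initial_budget) are hypothetical illustrations, not the authors' actual API.

    class MisconductBudget:
        """Tracks a per-site misbehavior allowance; quarantines only on exhaustion."""

        def __init__(self, site_ids, initial_budget=3):
            # Each participating site starts with the same allowance of
            # tolerated misconduct flags (possible false positives).
            # The value 3 is an arbitrary placeholder, not from the paper.
            self.budget = {site: initial_budget for site in site_ids}
            self.quarantined = set()

        def record_flag(self, site):
            """Charge one flagged (possibly false-positive) misconduct event.

            Returns True if the site is (now) quarantined."""
            if site in self.quarantined:
                return True
            self.budget[site] -= 1
            if self.budget[site] <= 0:
                # Budget exhausted: exclude the site from future aggregation
                # rounds rather than ostracizing it on the first alarm.
                self.quarantined.add(site)
            return site in self.quarantined


    # Usage: with a budget of 3, site "B" survives two flags and is
    # excluded only on the third, preserving sample size under sporadic
    # false-positive detections.
    tracker = MisconductBudget(["A", "B", "C"], initial_budget=3)
    assert tracker.record_flag("B") is False  # first flag: tolerated
    assert tracker.record_flag("B") is False  # second flag: tolerated
    assert tracker.record_flag("B") is True   # third flag: quarantined

This tolerance-before-quarantine design is what distinguishes the approach from a zero-tolerance baseline, which would eject a site on its first flagged event and forfeit its data contribution even when the flag was spurious.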

Authors

  • Maxim Edelson
    Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA, USA.
  • Anh Pham
    Deloitte Consulting LLP, Atlanta, GA, USA.
  • Tsung-Ting Kuo
    University of California San Diego, La Jolla, CA, USA.