LoAdaBoost: Loss-based AdaBoost federated machine learning with reduced computational complexity on IID and non-IID intensive care data.

Journal: PloS one
Published Date:

Abstract

Intensive care data are valuable for improving health care, informing policy and many other purposes. Vast amounts of such data are stored in different locations, on many different devices and in separate data silos. Sharing data among these sources is a major challenge for regulatory, operational and security reasons. One potential solution is federated machine learning, a method that sends machine learning algorithms to all data sources simultaneously, trains a model at each source and aggregates the learned models. This strategy allows valuable data to be used without moving them. One challenge in applying federated machine learning is that data from diverse sources may follow different distributions. To tackle this problem, we proposed an adaptive boosting method named LoAdaBoost that increases the efficiency of federated machine learning. Using intensive care unit data from hospitals, we investigated learning performance in IID and non-IID data distribution scenarios and showed that the proposed LoAdaBoost method achieved higher predictive accuracy with lower computational complexity than the baseline method.
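
The train-locally-then-aggregate loop described in the abstract can be illustrated with a minimal sketch. It assumes plain weighted averaging of model weights, a logistic regression model and synthetic data; none of these come from the paper, and the sketch does not implement the LoAdaBoost method itself. All function names (local_train, federated_round, make_sites, run_federated) are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def local_train(weights, X, y, epochs=5, lr=0.1):
    """Run a few epochs of gradient descent on one site's data (assumed model: logistic regression)."""
    w = weights.copy()
    for _ in range(epochs):
        grad = X.T @ (sigmoid(X @ w) - y) / len(y)
        w -= lr * grad
    return w

def federated_round(global_w, sites):
    """Send the global weights to every site, train locally, then average by sample count."""
    local_ws = [local_train(global_w, X, y) for X, y in sites]
    sizes = np.array([len(y) for _, y in sites], dtype=float)
    return np.average(local_ws, axis=0, weights=sizes)

def log_loss(w, X, y):
    """Mean binary cross-entropy of weights w on data (X, y)."""
    p = np.clip(sigmoid(X @ w), 1e-12, 1 - 1e-12)
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))

def make_sites(n_sites=3, n=200, d=4, seed=0):
    """Synthetic stand-in for per-hospital data; each site's data never leave this list entry."""
    rng = np.random.default_rng(seed)
    true_w = rng.normal(size=d)
    sites = []
    for _ in range(n_sites):
        X = rng.normal(size=(n, d))
        y = (X @ true_w + 0.1 * rng.normal(size=n) > 0).astype(float)
        sites.append((X, y))
    return sites

def run_federated(sites, rounds=20):
    """Repeat communication rounds starting from an all-zero global model."""
    w = np.zeros(sites[0][0].shape[1])
    for _ in range(rounds):
        w = federated_round(w, sites)
    return w

if __name__ == "__main__":
    sites = make_sites()
    w = run_federated(sites)
    X_all = np.vstack([X for X, _ in sites])
    y_all = np.concatenate([y for _, y in sites])
    print("pooled loss after training:", log_loss(w, X_all, y_all))
```

Only model weights cross site boundaries in this sketch; the pooled arrays in the final lines exist solely to evaluate the result and would not be available in a real deployment.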

Authors

  • Li Huang
    National Research Center for Resettlement (NRCR), Hohai University, 1 Xikang Road, Nanjing 210098, China. lily8214@hhu.edu.cn.
  • Yifeng Yin
    University of Huddersfield, Huddersfield, England, United Kingdom.
  • Zeng Fu
    University of California San Diego, San Diego, California, United States of America.
  • Shifa Zhang
    Northeastern University, Boston, Massachusetts, United States of America.
  • Hao Deng
    Faculty of Information Technology, Macau University of Science and Technology, Avenida Wai Long, Taipa, Macau, China.
  • Dianbo Liu
    Computer Science and Artificial Intelligence Laboratory, MIT, Cambridge.