Reservoir hosts prediction for COVID-19 by hybrid transfer learning model.

Journal: Journal of biomedical informatics
Published Date:

Abstract

The recent outbreak of COVID-19 has infected millions of people around the world, leading to a global emergency. In a virus outbreak, it is crucial to identify the carriers of the virus promptly and precisely, so that the animal origins can be isolated to prevent further infection. Traditional identification relies on field and laboratory research, which delays the response to an emerging epidemic. With the development of machine learning, recent research has demonstrated that viral hosts can be predicted efficiently. However, limited annotated virus data and imbalanced host information prevent these approaches from achieving better results. To predict the animal origins of COVID-19 with high reliability, we extend transfer learning and ensemble learning into a hybrid transfer learning model. When predicting the hosts of a newly discovered virus, our model uses a related virus domain as auxiliary data to help build a robust model for the target virus domain. Simulation results on several UCI benchmarks and viral genome datasets demonstrate that our model outperforms classical methods when target training sets are limited and classes are imbalanced. We evaluate the feasibility of our approach by setting coronaviruses as the target domain and other related viruses as the source domain. Finally, we present the predicted animal reservoirs of COVID-19 for further analysis.
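
The abstract does not specify the model's internals, so the following is only a minimal sketch of the general idea of pooling a source (related virus) domain with a small target domain while compensating for class imbalance. It is not the authors' hybrid model: the weighting scheme, the nearest-centroid classifier, the `source_weight` parameter, and all function names are assumptions introduced for illustration.

```python
from collections import Counter


def balanced_domain_weights(labels, domains, source_weight=0.3):
    """Per-sample weight = inverse class frequency * domain factor.

    `source_weight` is a hypothetical down-weighting applied to auxiliary
    (source-domain) samples; target-domain samples keep a factor of 1.0.
    """
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    weights = []
    for y, d in zip(labels, domains):
        class_w = n / (k * counts[y])  # rare host classes get larger weights
        domain_w = 1.0 if d == "target" else source_weight
        weights.append(class_w * domain_w)
    return weights


def weighted_centroids(features, labels, weights):
    """Compute one weighted mean feature vector per host class."""
    sums, totals = {}, {}
    for x, y, w in zip(features, labels, weights):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += w * v
        totals[y] = totals.get(y, 0.0) + w
    return {y: [s / totals[y] for s in acc] for y, acc in sums.items()}


def predict(centroids, x):
    """Assign x to the class whose centroid is closest (squared Euclidean)."""
    def sqdist(c):
        return sum((a - b) ** 2 for a, b in zip(c, x))
    return min(centroids, key=lambda y: sqdist(centroids[y]))
```

Here the source samples still contribute to each class centroid, but with reduced influence, and under-represented host classes are up-weighted so that a majority class does not dominate. An ensemble, as mentioned in the abstract, could be built by combining several such weighted learners; that step is omitted here.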

Authors

  • Yun Yang
    Department of Chemistry, South University of Science and Technology, Shenzhen 518055, China.
  • Jing Guo
    College of Chemical Engineering, Department of Pharmaceutical Engineering, Northwest University, Xi'an, Shaanxi, China.
  • Pei Wang
    College of Engineering and Technology, Key Laboratory of Agricultural Equipment for Hilly and Mountain Areas, Southwest University, Chongqing, China.
  • Yaowei Wang
    PengCheng Laboratory, China. Electronic address: wangyw@pcl.ac.cn.
  • Minghao Yu
    School of Software, Yunnan University, Kunming, China.
  • Xiang Wang
    Department of Thoracic Surgery, The Second Xiangya Hospital of Central South University, Changsha, Hunan, China.
  • Po Yang
  • Liang Sun
    College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, MIIT Key Laboratory of Pattern Analysis and Machine Intelligence, Nanjing, 211106, China; Department of Radiology and BRIC, University of North Carolina at Chapel Hill, North Carolina, 27599, USA.