Divide-and-conquer: machine-learning integrates mammalian and viral traits with network features to predict virus-mammal associations.

Journal: Nature Communications
PMID:

Abstract

Our knowledge of viral host ranges remains limited. Completing this picture by identifying unknown hosts of known viruses is an important research aim that can help identify and mitigate zoonotic and animal-disease risks, such as spill-over from animal reservoirs into human populations. To address this knowledge gap, we apply a divide-and-conquer approach that separates viral, mammalian and network features into three distinct perspectives, each predicting associations independently to enhance predictive power. Our approach predicts over 20,000 unknown associations between known viruses and susceptible mammalian species, suggesting that current knowledge underestimates the number of associations in wild and semi-domesticated mammals by a factor of 4.3, and the average potential mammalian host range of viruses by a factor of 3.2. In particular, our results highlight a significant knowledge gap in the wild reservoirs of important zoonotic viruses and viruses of domesticated mammals: specifically, lyssaviruses, bornaviruses and rotaviruses.
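
The divide-and-conquer idea described above can be illustrated with a minimal sketch: three perspectives each score a candidate virus-mammal pair independently, and the scores are then combined. The abstract does not state how the perspectives are combined, so the simple averaging rule, the 0.5 threshold, the function names and all scores below are assumptions made purely for illustration; they are not the authors' implementation or results.

```python
from statistics import mean

# Hypothetical per-perspective scores in [0, 1] for candidate virus-mammal pairs.
# All names and values are invented for illustration only.
viral_scores = {("virus_A", "mammal_X"): 0.9, ("virus_A", "mammal_Y"): 0.2}
mammal_scores = {("virus_A", "mammal_X"): 0.7, ("virus_A", "mammal_Y"): 0.4}
network_scores = {("virus_A", "mammal_X"): 0.8, ("virus_A", "mammal_Y"): 0.3}


def combine(pair, perspectives):
    """Average the independent scores for one pair (assumed combination rule)."""
    return mean(scores[pair] for scores in perspectives)


def predict_associations(perspectives, threshold=0.5):
    """Return pairs whose combined score reaches the (assumed) threshold."""
    # Only pairs scored by every perspective can be combined.
    pairs = set.intersection(*(set(scores) for scores in perspectives))
    combined = {pair: combine(pair, perspectives) for pair in sorted(pairs)}
    return {pair: score for pair, score in combined.items() if score >= threshold}


predicted = predict_associations([viral_scores, mammal_scores, network_scores])
for (virus, mammal), score in predicted.items():
    print(f"{virus} -> {mammal}: combined score {score:.2f}")
```

Keeping each perspective as a separate scorer means any one of them can be retrained or replaced without touching the others; only the combination step needs to know that there are three.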

Authors

  • Maya Wardeh
    Department of Epidemiology and Population Health, Institute of Infection and Global Health, University of Liverpool, Liverpool Science Park IC2 Building, 146 Brownlow Hill, Liverpool L3 5RF, UK.
  • Marcus S C Blagrove
    Department of Evolution, Ecology and Behaviour, Institute of Infection, Veterinary & Ecological Sciences, University of Liverpool, Liverpool, UK.
  • Kieran J Sharkey
    Department of Mathematical Sciences, University of Liverpool, Peach Street, Liverpool L69 7ZL, UK.
  • Matthew Baylis
    Department of Epidemiology and Population Health, Institute of Infection and Global Health, University of Liverpool, Leahurst Campus, Chester High Road, Neston CH64 7TE, UK.