Museum collections and machine learning guide discovery of novel coronaviruses and paramyxoviruses
Journal: bioRxiv
Published Date: Jan 1, 2025
Abstract
Natural history museum collections are valuable but underutilized resources for viral discovery, offering opportunities to test hypotheses about viral occurrence across space, time, and taxonomic groups. We developed machine learning models of bat host suitability to guide coronavirus and paramyxovirus screening of 1330 and 491 tissues, respectively, in a museum collection. For the first time, we recovered coronavirus (n = 16) and paramyxovirus (n = 3) sequences from archived museum tissues, confirming three novel coronavirus host species and three novel paramyxovirus host species (3% and 33% prediction success rates, respectively). These sequences included a SARS-like coronavirus and an orthoparamyxovirus from Angolan Rhinolophus fumigatus specimens collected in June 2019, suggesting that viruses with epidemic potential may be more widespread in sub-Saharan Africa than previously believed. Our study demonstrates the value of combining predictive modeling and collections-based viral discovery, particularly for filling outstanding sampling gaps and investigating changes in host–virus associations over time.