Early detection of emerging SARS-CoV-2 variants from wastewater through genome sequencing and machine learning.

Journal: Nature Communications
Published Date:

Abstract

Genome sequencing from wastewater enables accurate and cost-effective identification of SARS-CoV-2 variants. However, existing computational pipelines have limitations in detecting emerging variants not yet characterized in humans. Here, we present an unsupervised learning approach that clusters co-varying and time-evolving mutation patterns to identify SARS-CoV-2 variants. To build our model, we sequence 3659 wastewater samples collected over two years from urban and rural locations in Southern Nevada. We then develop a multivariate independent component analysis (ICA)-based pipeline to transform mutation frequencies into independent sources. These data-driven time-evolving and co-varying sources are compared to 8810 SARS-CoV-2 clinical genomes from Nevadans. Our method accurately detects the Delta variant in late 2021, Omicron variants in 2022, and emerging recombinant XBB variants in 2023. Our approach also reveals the spatial and temporal dynamics of variants in both urban and rural regions; achieves earlier detection of most variants compared to other computational tools; and uncovers unique co-varying mutation patterns not associated with any known variant. The multivariate nature of our pipeline boosts statistical power and supports accurate early detection of SARS-CoV-2 variants. This feature offers a unique opportunity to detect emerging variants and pathogens, even in the absence of clinical testing.
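The idea of grouping co-varying, time-evolving mutations can be illustrated with a much simpler stand-in than the ICA-based pipeline described above. The sketch below is not the authors' method: it swaps ICA for plain pairwise Pearson correlation followed by single-linkage grouping. The mutation names (`mut_A1`, ...), the frequency values, the function names, and the 0.9 threshold are all hypothetical and chosen only for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant series is treated as co-varying with nothing
    return cov / sqrt(vx * vy)


def group_covarying(freqs, threshold=0.9):
    """Group mutations whose frequency time series correlate at or above threshold.

    Single-linkage grouping via union-find: two mutations share a group if a
    chain of high-correlation pairs connects them. This is a simplified
    stand-in for the source-separation step, NOT an ICA implementation.
    """
    names = list(freqs)
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links to the group representative (with path halving).
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if pearson(freqs[a], freqs[b]) >= threshold:
                parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical toy data: per-mutation frequency at six sampling dates.
toy_freqs = {
    "mut_A1": [0.05, 0.40, 0.80, 0.30, 0.05, 0.00],  # early wave
    "mut_A2": [0.04, 0.38, 0.82, 0.28, 0.06, 0.01],  # tracks mut_A1
    "mut_B1": [0.00, 0.00, 0.10, 0.60, 0.90, 0.95],  # later wave
    "mut_B2": [0.01, 0.00, 0.12, 0.58, 0.88, 0.97],  # tracks mut_B1
    "mut_C1": [0.50, 0.50, 0.50, 0.50, 0.50, 0.50],  # flat, no dynamics
}

groups = group_covarying(toy_freqs)
print(groups)
```

In this toy setup, mutations that rise and fall together end up in one group, which loosely mirrors how a variant's signature mutations would share a temporal profile. Pairwise correlation handles overlapping waves poorly, which is one reason a multivariate decomposition such as ICA is a more natural fit for real wastewater data.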

Authors

  • Xiaowei Zhuang
    Cleveland Clinic Lou Ruvo Center for Brain Health, Las Vegas, NV, 89106, USA.
  • Van Vo
    Laboratory of Neurogenetics and Precision Medicine, College of Sciences, University of Nevada Las Vegas, Las Vegas, NV, USA.
  • Michael A Moshi
    Laboratory of Neurogenetics and Precision Medicine, College of Sciences, University of Nevada Las Vegas, Las Vegas, NV, USA.
  • Ketan Dhede
    Laboratory of Neurogenetics and Precision Medicine, College of Sciences, University of Nevada Las Vegas, Las Vegas, NV, USA.
  • Nabih Ghani
    Department of Medical Education, Kirk Kerkorian School of Medicine at UNLV, 625 Shadow Lane, Las Vegas, NV 89106, USA.
  • Shahraiz Akbar
    Laboratory of Neurogenetics and Precision Medicine, College of Sciences, University of Nevada Las Vegas, Las Vegas, NV, USA.
  • Ching-Lan Chang
    Laboratory of Neurogenetics and Precision Medicine, College of Sciences, University of Nevada Las Vegas, Las Vegas, NV, USA.
  • Angelia K Young
    Southern Nevada Health District, Las Vegas, NV, USA.
  • Erin Buttery
    Southern Nevada Health District, Las Vegas, NV, USA.
  • William Bendik
    Southern Nevada Health District, Las Vegas, NV, USA.
  • Hong Zhang
    Department of Anesthesiology and Operation, The First Hospital of Lanzhou University, Lanzhou, Gansu, China.
  • Salman Afzal
    Southern Nevada Health District, Las Vegas, NV, USA.
  • Duane Moser
    Division of Hydrologic Sciences, Desert Research Institute, Las Vegas, NV, USA.
  • Dietmar Cordes
    Cleveland Clinic Lou Ruvo Center for Brain Health, Las Vegas, NV, 89106, USA; Department of Psychology and Neuroscience, University of Colorado, Boulder, CO, 80309, USA. Electronic address: cordesd@ccf.org.
  • Cassius Lockett
    Southern Nevada Health District, Las Vegas, NV, USA. lockett@snhd.org.
  • Daniel Gerrity
    Southern Nevada Water Authority, P.O. Box 99954, Las Vegas, NV, USA. daniel.gerrity@snwa.com.
  • Horng-Yuan Kan
    Southern Nevada Health District, Las Vegas, NV, USA. kan@snhd.org.
  • Edwin C Oh
    Interdisciplinary Neuroscience PhD Program, University of Nevada, Las Vegas, NV, USA.