Computational Discovery of CRISPR-Cas13b Guide RNAs for Broad-Spectrum Dengue Virus Targeting
Journal: bioRxiv
Published Date: Jan 21, 2026
Abstract
Dengue virus (DENV), an RNA virus, remains a significant global health threat, particularly in developing regions, with no widely effective antiviral therapy available. The CRISPR-Cas13b system, specifically the PspCas13b ortholog, has emerged as a promising programmable antiviral tool capable of targeting viral RNA with high specificity. However, the efficacy of Cas13b-based interventions depends heavily on the design of potent, conserved CRISPR RNA (crRNA) spacer sequences, a task complicated by high viral genetic diversity. Unlike CRISPR-Cas9, which targets double-stranded DNA, Cas13b directly targets single-stranded RNA, making it well suited to RNA virus therapeutics. Existing computational tools, however, predominantly focus on Cas9 DNA targeting or on Cas13d for mammalian transcript knockdown, leaving a significant gap for Cas13b-specific antiviral design. In this paper, we propose a computational pipeline and machine learning framework for the rational design of high-efficacy Cas13b guide RNAs targeting all four dengue virus serotypes. Our approach integrates large-scale genomic data extraction, conservation analysis, and a novel in silico optimization module for guide RNA (gRNA) sequences based on recently reported Cas13b design rules (e.g., 5' GG motif preference, cytosine penalties). To predict targeting efficiency, we benchmark classical machine learning models (Random Forest, XGBoost) against foundation model-based predictors (Nucleotide Transformer, RNA-FM) using a dataset of experimentally validated spacers. Our results demonstrate that, in low-data regimes, classical feature-engineered models significantly outperform deep learning approaches when trained on experimentally validated gRNA datasets. We identify highly conserved, optimized crRNA candidates, including several pan-serotype guides with predicted high potency.
This work establishes a baseline for Cas13b efficiency prediction and provides a robust computational resource for accelerating the development of CRISPR-based antivirals against Dengue and other RNA viruses.
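As a rough illustration of how the design rules and conservation analysis mentioned above might be combined, the following minimal Python sketch scores a candidate spacer and estimates how many genomes contain its target site. It is not the authors' pipeline. The function names, the `gg_bonus` and `c_penalty` weights, and the exact-match definition of conservation are all assumptions made for illustration; the abstract states only that a 5' GG motif is preferred and that cytosines are penalized. The sketch also assumes spacers are written 5' to 3' and that each target site is the reverse complement of its spacer.

```python
# Hypothetical sketch: heuristic spacer scoring plus a naive conservation check.
# Weights and function names are illustrative assumptions, not values from the paper.

_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def to_rna(seq: str) -> str:
    """Uppercase a sequence and convert DNA notation (T) to RNA (U)."""
    return seq.upper().replace("T", "U")


def target_site(spacer: str) -> str:
    """Return the RNA target site, i.e. the reverse complement of the spacer."""
    return to_rna(spacer).translate(_COMPLEMENT)[::-1]


def heuristic_score(spacer: str, gg_bonus: float = 1.0, c_penalty: float = 0.1) -> float:
    """Reward a 5' GG motif and subtract a fixed penalty per cytosine (assumed weights)."""
    s = to_rna(spacer)
    score = gg_bonus if s.startswith("GG") else 0.0
    score -= c_penalty * s.count("C")
    return score


def conservation(spacer: str, genomes: list[str]) -> float:
    """Fraction of genomes containing an exact match to the spacer's target site."""
    if not genomes:
        return 0.0
    site = target_site(spacer)
    hits = sum(site in to_rna(g) for g in genomes)
    return hits / len(genomes)


if __name__ == "__main__":
    toy_genomes = ["AAAUCCGG", "UUUUUUUU"]  # toy sequences, not real DENV genomes
    candidate = "GGAU"
    print(heuristic_score(candidate), conservation(candidate, toy_genomes))
```

A real pipeline would likely tolerate mismatches when assessing conservation and would learn feature weights from experimentally validated spacers rather than fixing them by hand.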