Learning to utilize internal protein 3D nanoenvironment descriptors in predicting CRISPR-Cas9 off-target activity.

Journal: NAR genomics and bioinformatics
Published Date:

Abstract

Despite advances in determining the factors influencing cleavage activity of a CRISPR-Cas9 single guide RNA (sgRNA) at an (off-)target DNA sequence, a comprehensive assessment of pertinent physico-chemical/structural descriptors is missing. In particular, studies have not yet directly exploited the information-rich internal protein 3D nanoenvironment of the sgRNA-(off-)target strand DNA pair, which we obtain by harvesting 634 980 residue-level features for CRISPR-Cas9 complexes. As a proof-of-concept study, we simulated the internal protein 3D nanoenvironment for all experimentally available single-base protospacer-adjacent motif-distal mutations for a given sgRNA-target strand pair. By determining the residue-level features most relevant to CRISPR-Cas9 off-target cleavage activity, we developed STING_CRISPR, a machine learning model that accurately predicts off-target cleavage activity for the type of single-base mutations considered in this study. By interpreting STING_CRISPR, we identified four important Cas9 residue spatial hotspots and the associated structural/physico-chemical descriptor classes influencing CRISPR-Cas9 (off-)target cleavage activity for the sgRNA-target strand pairs covered in this study.
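
The feature-relevance step mentioned above can be illustrated with a minimal sketch. The abstract does not say which selection method or model STING_CRISPR uses, so everything here is an assumption made for illustration: the ranking criterion (absolute Pearson correlation), the helper names `pearson` and `rank_features`, and the synthetic data, which is not taken from the paper.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def rank_features(X, y, top_k):
    """Return the indices of the top_k columns of X ranked by |Pearson r| with y."""
    n_features = len(X[0])
    scores = [abs(pearson([row[j] for row in X], y)) for j in range(n_features)]
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order[:top_k]


# Synthetic stand-in data (NOT from the paper): 200 samples, 50 features,
# where only features 3 and 17 drive the toy "cleavage activity" value.
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(50)] for _ in range(200)]
y = [2.0 * row[3] - 1.5 * row[17] + rng.gauss(0, 0.1) for row in X]

selected = rank_features(X, y, top_k=2)
print(sorted(selected))
```

A univariate filter like this is only one of many ways to reduce a very large descriptor set before model fitting; it ignores interactions between features, which a model-based importance measure could capture.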

Authors

  • Jeffrey Kelvin Mak
    Department of Computer Science, University of Oxford, Parks Road, Oxford OX1 3QD, United Kingdom.
  • Artemi Bendandi
    CONCEPT Lab, Istituto Italiano di Tecnologia, Via Melen - 83, B Block, 16152 Genova, Italy.
  • José Augusto Salim
    Department of Plant Biology, Institute of Biology, University of Campinas - UNICAMP, SP, 13083-872, Brazil.
  • Ivan Mazoni
    Computational Biology Research Group, Embrapa Digital Agriculture, Campinas, SP, 13083-886, Brazil.
  • Fabio Rogerio de Moraes
    Physics Department, Institute of Biosciences, Languages, and Exact Sciences (IBILCE), São Paulo State University (Unesp), São José do Rio Preto, SP, 15054-000, Brazil.
  • Luiz Borro
    beOn Claro, São Paulo, SP, 04709-110, Brazil.
  • Florian Störtz
    Department of Computer Science, University of Oxford, Parks Road, Oxford OX1 3QD, United Kingdom.
  • Walter Rocchia
    CONCEPT Lab, Istituto Italiano di Tecnologia, via Morego 30, 16163 Genova, Italy.
  • Goran Neshich
    Computational Biology Research Group, Embrapa Digital Agriculture, Campinas, SP, 13083-886, Brazil.
  • Peter Minary
    Department of Computer Science, University of Oxford, Oxford, UK. peter.minary@cs.ox.ac.uk.