Accurate Prediction of CRISPR/Cas13a Guide Activity Using Feature Selection and Deep Learning.

Journal: Journal of Chemical Information and Modeling
PMID:

Abstract

CRISPR/Cas13a is a key tool for nucleic acid testing; accurate prediction of its guide activity is therefore essential for building robust and sensitive diagnostics. In this study, we develop a dual-branch neural network model that achieves high prediction accuracy and classification performance on two independent CRISPR/Cas13a data sets, outperforming previously published models that rely solely on sequence features. The model integrates direct sequence encoding with 99 key descriptive features, selected from 1553 candidates through statistical analysis, that critically influence guide-target interactions and Cas13a guide activity. Using Shapley Additive Explanations and Integrated Gradients for feature importance analysis, we show that sequence composition, mismatch type and frequency, and the protospacer flanking site region are the primary features. These findings underscore the importance of descriptive features as complementary inputs to deep learning-based encoding and provide insight into the mechanisms underlying guide-target interaction. Overall, this study introduces a reliable and efficient model for Cas13a guide activity prediction and offers a foundation for future rational design efforts.
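
As a rough illustration of the two kinds of input the abstract describes (a direct sequence encoding and hand-crafted descriptive features), the following minimal Python sketch may help. It is not the published model: the one-hot scheme, the three features shown, and the function names `one_hot`, `reverse_complement`, and `descriptive_features` are all assumptions chosen for illustration, and G-U wobble pairing is deliberately ignored.

```python
# Illustrative sketch only. The encoding and the features below are
# assumptions, not the inputs actually used by the published model.

BASES = "ACGU"  # RNA alphabet assumed for Cas13a guides and targets
COMPLEMENT = {"A": "U", "U": "A", "C": "G", "G": "C"}


def one_hot(seq):
    """Encode an RNA sequence as rows of 0/1 flags ordered A, C, G, U.

    Characters outside the alphabet (for example N) become all-zero rows.
    """
    return [[1 if base == b else 0 for b in BASES] for base in seq.upper()]


def reverse_complement(seq):
    """Return the Watson-Crick reverse complement of an RNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def descriptive_features(guide, target):
    """Compute a few hypothetical descriptive features for a guide-target pair.

    The guide is compared with the perfect-match guide for the target site,
    taken here to be the target's reverse complement (wobble pairs ignored).
    """
    guide = guide.upper()
    perfect_guide = reverse_complement(target)
    if len(guide) != len(perfect_guide):
        raise ValueError("guide and target site must have the same length")
    mismatch_positions = [
        i for i, (g, p) in enumerate(zip(guide, perfect_guide)) if g != p
    ]
    gc_content = sum(base in "GC" for base in guide) / len(guide)
    return {
        "gc_content": gc_content,
        "mismatch_count": len(mismatch_positions),
        "mismatch_positions": mismatch_positions,
    }


if __name__ == "__main__":
    target_site = "AACG"   # toy target, far shorter than a real spacer
    guide = "CGAU"         # one mismatch relative to the perfect guide CGUU
    print(one_hot(guide))
    print(descriptive_features(guide, target_site))
```

In a dual-branch design of the kind described, the matrix from `one_hot` would plausibly feed the sequence branch while a vector built from the descriptive features would feed the second branch; how the published model actually combines them is not specified in this abstract.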

Authors

  • Jiashun Fu
    Research Center for Analytical Sciences, College of Chemistry, Nankai University, Tianjin 300071, China.
  • Xuyang Liu
    Department of Hepatobiliary Surgery, Affiliated Hospital of Guizhou Medical University, Guiyang, P. R. China.
  • Ruijie Deng
    College of Biomass Science and Engineering, Healthy Food Evaluation Research Center, Sichuan University, Chengdu 610065, China.
  • Xiue Jiang
    Research Center for Analytical Sciences, College of Chemistry, Nankai University, Tianjin 300071, China.
  • Wensheng Cai
    Research Center for Analytical Sciences, Frontiers Science Center for New Organic Matter, College of Chemistry, Nankai University, Tianjin 300071, China.
  • Haohao Fu
    Research Center for Analytical Sciences, Frontiers Science Center for New Organic Matter, College of Chemistry, Nankai University, Tianjin 300071, China.
  • Xueguang Shao
    Research Center for Analytical Sciences, Frontiers Science Center for New Organic Matter, College of Chemistry, Nankai University, Tianjin 300071, China.