Zero-shot prediction of mutation effects with multimodal deep representation learning guides protein engineering.

Journal: Cell Research
Published Date:

Abstract

Mutations in amino acid sequences can alter protein function. Accurate, unsupervised prediction of mutation effects is critical in biotechnology and biomedicine but remains a fundamental challenge. To address this challenge, we present Protein Mutational Effect Predictor (ProMEP), a general, multiple sequence alignment-free method that enables zero-shot prediction of mutation effects. ProMEP embeds a multimodal deep representation learning model that learns both sequence and structure contexts from ~160 million proteins. ProMEP achieves state-of-the-art performance in mutational effect prediction with a substantial improvement in speed, enabling efficient and intelligent protein engineering. Specifically, ProMEP accurately predicts mutational consequences for the gene-editing enzymes TnpB and TadA, and successfully guides the development of high-performance gene-editing tools based on their engineered variants. The gene-editing efficiency of a 5-site mutant of TnpB reaches up to 74.04% (vs 24.66% for the wild type), and the base editing tool built on a TadA 15-site mutant (in addition to the A106V/D108N double mutation that confers deoxyadenosine deaminase activity on TadA) exhibits an A-to-G conversion frequency of up to 77.27% (vs 69.80% for ABE8e, a previous TadA-based adenine base editor), with significantly reduced bystander and off-target effects compared to ABE8e. ProMEP thus shows superior performance in predicting mutational effects on proteins and a strong capability to guide protein engineering. It enables efficient exploration of the vast protein space and facilitates the practical design of proteins, thereby advancing studies in biomedicine and synthetic biology.

Authors

  • Peng Cheng
    University of Kansas Medical Center, Department of Internal Medicine, Division of Medical Informatics, Kansas City, KS, USA.
  • Cong Mao
    State Key Laboratory of Reproductive Medicine and Offspring Health, Women's Hospital of Nanjing Medical University, Nanjing Maternity and Child Health Care Hospital, Nanjing Medical University, Nanjing, Jiangsu, China.
  • Jin Tang
    Information Materials and Intelligent Sensing Laboratory of Anhui Province, School of Computer Science and Technology, Anhui University, Hefei, China; Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, China.
  • Sen Yang
    Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, 130012, China.
  • Yu Cheng
  • Wuke Wang
    Zhejiang Lab, Hangzhou, Zhejiang, China.
  • Qiuxi Gu
    State Key Laboratory of Reproductive Medicine and Offspring Health, Women's Hospital of Nanjing Medical University, Nanjing Maternity and Child Health Care Hospital, Nanjing Medical University, Nanjing, Jiangsu, China.
  • Wei Han
    Department of Pharmacology, The Key Laboratory of Neural and Vascular Biology, The Key Laboratory of New Drug Pharmacology and Toxicology, Ministry of Education, Collaborative Innovation Center of Hebei Province for Mechanism, Diagnosis and Treatment of Neuropsychiatric Diseases, Hebei Medical University, Shijiazhuang, Hebei, China.
  • Hao Chen
    The First School of Medicine, Wenzhou Medical University, Wenzhou, China.
  • Sihan Li
    Centre for Big Data Research in Health, University of New South Wales, Sydney, NSW, Australia.
  • Yaofeng Chen
    Beijing Institute of Radiation Medicine, Beijing, 100850, China.
  • Jianglin Zhou
    Bioinformatics Center of AMMS, Beijing, China.
  • Wuju Li
    Bioinformatics Center of AMMS, Beijing, China.
  • Aimin Pan
    Zhejiang Lab, Hangzhou, Zhejiang, China.
  • Suwen Zhao
    iHuman Institute, ShanghaiTech University, Shanghai, China.
  • Xingxu Huang
    Zhejiang Lab, Hangzhou, Zhejiang, China.
  • Shiqiang Zhu
  • Jun Zhang
    First School of Clinical Medicine, Shandong University of Traditional Chinese Medicine, Jinan, China.
  • Wenjie Shu
    Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing, 100850, China.
  • Shengqi Wang
    Beijing Institute of Radiation Medicine, Beijing, 100850, PR China; Beijing Key Laboratory of New Molecular Diagnosis Technologies for Infectious Diseases, Beijing, 100850, PR China. Electronic address: sqwang@bmi.ac.cn.