Empirical Comparison and Analysis of Artificial Intelligence-Based Methods for Identifying Phosphorylation Sites of SARS-CoV-2 Infection.

Journal: International journal of molecular sciences
PMID:

Abstract

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is a member of the large coronavirus family with high infectivity and pathogenicity and is the primary pathogen causing the global pandemic of coronavirus disease 2019 (COVID-19). Phosphorylation is a major type of protein post-translational modification that plays an essential role in the process of SARS-CoV-2-host interactions. The precise identification of phosphorylation sites in host cells infected with SARS-CoV-2 will be of great importance to investigate potential antiviral responses and mechanisms and exploit novel targets for therapeutic development. Numerous computational tools have been developed on the basis of phosphoproteomic data generated by mass spectrometry-based experimental techniques, with which phosphorylation sites can be accurately ascertained across the whole SARS-CoV-2-infected proteomes. In this work, we have comprehensively reviewed several major aspects of the construction strategies and availability of these predictors, including benchmark dataset preparation, feature extraction and refinement methods, machine learning algorithms and deep learning architectures, model evaluation approaches and metrics, and publicly available web servers and packages. We have highlighted and compared the prediction performance of each tool on the independent serine/threonine (S/T) and tyrosine (Y) phosphorylation datasets and discussed the overall limitations of current existing predictors. In summary, this review would provide pertinent insights into the exploitation of new powerful phosphorylation site identification tools, facilitate the localization of more suitable target molecules for experimental verification, and contribute to the development of antiviral therapies.

Authors

  • Hongyan Lai
    Department of Medical Oncology, Shanghai Key Laboratory of Medical Epigenetics, Fudan University Shanghai Cancer Center, Institutes of Biomedical Sciences, Fudan University, 270 Dong An Rd, Shanghai, 200032, China.
  • Tao Zhu
    Wuhan Zoncare Bio-Medical Electronics Co., Ltd, Wuhan, China.
  • Sijia Xie
    Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China.
  • Xinwei Luo
    Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China.
  • Feitong Hong
    Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China.
  • Diyu Luo
    Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing 400065, China.
  • Fuying Dao
    School of Biological Sciences, Nanyang Technological University, Singapore 639798, Singapore.
  • Hao Lin
    Yangtze Delta Region Institute (Huzhou), University of Electronic Science and Technology of China, Huzhou, Zhejiang, China.
  • Kunxian Shu
    Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing 400065, China.
  • Hao Lv
    1 Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China , Chengdu, China .