EUP: Enhanced cross-species prediction of ubiquitination sites via a conditional variational autoencoder network based on ESM2.

Journal: PLoS computational biology
Published Date:

Abstract

Ubiquitination is critical in biomedical research. Deep learning models for predicting ubiquitination sites have advanced the study of ubiquitination. However, traditional supervised models are limited in scenarios where labels are scarce across species. To address this issue, we introduce EUP, an online webserver for multi-species ubiquitination prediction and model interpretation. EUP is constructed by extracting lysine site-dependent features from the pretrained language model ESM2, then using conditional variational inference to reduce the ESM2 features to a lower-dimensional latent representation. With downstream models built on this latent feature representation, EUP exhibited superior performance in predicting ubiquitination sites across species while maintaining low inference latency. Furthermore, key features for predicting ubiquitination sites were identified across animals, plants, and microbes. The identification of shared key features that capture evolutionarily conserved traits enhances the interpretability of the EUP model for ubiquitination prediction. EUP is freely available at https://eup.aibtit.com/.
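The dimensionality-reduction step described above can be illustrated with a toy conditional variational autoencoder. This is a minimal sketch and not the EUP implementation: the single linear layers, the 64-dimensional stand-in for ESM2 lysine-site features, the 8-dimensional latent space, and the use of a one-hot species label as the condition are all assumptions made for illustration, and the class `TinyCVAE` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)


def one_hot(index, size):
    """Return a one-hot vector of length `size` with a 1 at `index`."""
    v = np.zeros(size)
    v[index] = 1.0
    return v


class TinyCVAE:
    """Toy conditional VAE with single linear layers (illustrative only)."""

    def __init__(self, feat_dim, cond_dim, latent_dim):
        self.latent_dim = latent_dim
        # Encoder maps [features, condition] -> [mu, log_var].
        self.W_enc = rng.normal(0, 0.01, (feat_dim + cond_dim, 2 * latent_dim))
        # Decoder maps [latent, condition] -> reconstructed features.
        self.W_dec = rng.normal(0, 0.01, (latent_dim + cond_dim, feat_dim))

    def encode(self, x, c):
        h = np.concatenate([x, c], axis=-1) @ self.W_enc
        return h[..., : self.latent_dim], h[..., self.latent_dim :]

    def reparameterize(self, mu, log_var):
        # z = mu + sigma * eps keeps sampling differentiable w.r.t. mu, sigma.
        eps = rng.standard_normal(mu.shape)
        return mu + np.exp(0.5 * log_var) * eps

    def decode(self, z, c):
        return np.concatenate([z, c], axis=-1) @ self.W_dec

    def loss(self, x, c):
        """Negative ELBO: squared-error reconstruction plus KL to N(0, I)."""
        mu, log_var = self.encode(x, c)
        z = self.reparameterize(mu, log_var)
        x_hat = self.decode(z, c)
        recon = np.mean(np.sum((x - x_hat) ** 2, axis=-1))
        kl = np.mean(
            -0.5 * np.sum(1 + log_var - mu**2 - np.exp(log_var), axis=-1)
        )
        return recon + kl, kl, mu


# Assumed sizes: 4 lysine sites, 64-dim placeholder features, 3 species.
x = rng.standard_normal((4, 64))
c = np.stack([one_hot(i % 3, 3) for i in range(4)])

model = TinyCVAE(feat_dim=64, cond_dim=3, latent_dim=8)
total, kl, mu = model.loss(x, c)
print(mu.shape)  # latent means that a downstream classifier could consume
```

In a setup like this, the latent mean `mu` would serve as the compact per-site representation passed to a downstream classifier, and the weights would be trained by minimizing the returned loss with a gradient-based optimizer, which the sketch omits.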

Authors

  • Junhao Liu
    Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, China. Electronic address: jh.liu@siat.ac.cn.
  • Zeyu Luo
    Chongqing Key Laboratory of Vector Insects.
  • Rui Wang
    Department of Clinical Laboratory Medicine Center, Inner Mongolia Autonomous Region People's Hospital, Hohhot, Inner Mongolia, China.
  • Xin Li
    Veterinary Diagnostic Center, Shanghai Animal Disease Control Center, Shanghai, China.
  • Yawen Sun
  • Zongqing Chen
    School of Mathematical Sciences.
  • Yu-Juan Zhang
    Chongqing Key Laboratory of Vector Insects.