E-SNPs&GO: embedding of protein sequence and function improves the annotation of human pathogenic variants.

Journal: Bioinformatics (Oxford, England)
PMID:

Abstract

MOTIVATION: Massive DNA sequencing technologies are revealing a huge number of human single-nucleotide polymorphisms that occur in protein-coding regions and may change the encoded protein sequences. Discriminating harmful protein variations from neutral ones is one of the crucial challenges in precision medicine. Computational tools based on artificial intelligence provide models for protein sequence encoding, bypassing database searches for evolutionary information. We leverage these new encoding schemes to annotate protein variants efficiently.

Authors

  • Matteo Manfredi
    Department of Urology, "San Luigi Gonzaga" Hospital, University of Turin, Orbassano (Turin), Italy.
  • Castrense Savojardo
  • Pier Luigi Martelli
    Biocomputing Group, CIRI Health Sciences & Technologies (HST), University of Bologna, Bologna, Italy.
  • Rita Casadio