SHARK: web server for alignment-free homology assessment for intrinsically disordered and unalignable protein regions.

Journal: Nucleic Acids Research
Published Date:

Abstract

Whereas alignment has been fundamental to sequence-based assessments of protein homology, it is ineffective for intrinsically disordered regions (IDRs) because of their lower sequence conservation and unique sequence properties. Here, we present a web server implementation of SHARK (bio-shark.org), an alignment-free algorithm for homology classification that compares the overall amino acid composition and the short regions (k-mers) shared between sequences (SHARK-scores). The output of these k-mer-based comparisons is used by SHARK-dive, a machine learning classifier, to detect homology between unalignable, disordered sequences. The web server, SHARK-web, provides sequence-versus-database assessment of protein sequence homology akin to conventional tools such as BLAST and HMMER. Additionally, we provide precomputed sets of IDR sequences from 16 model organism proteomes, facilitating searches against species-specific IDR-omes. SHARK-dive offers superior overall homology detection performance to BLAST and HMMER, driven by a large increase in sensitivity to low sequence identity homologs, and can be used to facilitate the study of sequence-function relationships in disordered, difficult-to-align regions.
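
To make the idea of alignment-free comparison concrete, the following is a minimal sketch of two generic similarity measures: the overlap of k-mer sets (Jaccard index) and the cosine similarity of amino acid composition. It is not the SHARK-score or the SHARK-dive classifier, whose definitions are not given in this abstract; the function names, the choice of k = 3, and the use of Jaccard and cosine measures are all assumptions made for illustration only.

```python
from collections import Counter
from math import sqrt


def kmers(seq, k):
    """Return the set of all overlapping k-mers in seq (empty if len(seq) < k)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def kmer_jaccard(a, b, k=3):
    """Illustrative k-mer overlap: shared k-mers divided by all distinct k-mers.

    This is a generic Jaccard index, NOT the published SHARK-score.
    """
    ka, kb = kmers(a, k), kmers(b, k)
    union = ka | kb
    return len(ka & kb) / len(union) if union else 0.0


def composition_cosine(a, b):
    """Illustrative composition similarity: cosine of residue-count vectors.

    Also an assumption for illustration, not the measure used by SHARK.
    """
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[x] * cb[x] for x in ca)  # Counter returns 0 for missing keys
    norm = sqrt(sum(v * v for v in ca.values())) * sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    # Two hypothetical low-complexity, disordered-like sequences.
    s1 = "SGSGRGGSGSYGGRGG"
    s2 = "GGRGSYGSGSGRGGSG"
    print(kmer_jaccard(s1, s2, k=3))
    print(composition_cosine(s1, s2))
```

Neither measure requires the residues of the two sequences to be placed in register, which is why approaches of this kind remain applicable when an alignment cannot be built reliably.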

Authors

  • Chi Fung Willis Chow
    Max Planck Institute of Molecular Cell Biology and Genetics, Pfotenhauerstrasse 108, 01307 Dresden, Germany.
  • Maxim Scheremetjew
    Max Planck Institute of Molecular Cell Biology and Genetics, 01307, Dresden, Germany.
  • HongKee Moon
    Max Planck Institute of Molecular Cell Biology and Genetics, Pfotenhauerstrasse 108, 01307 Dresden, Germany.
  • Soumyadeep Ghosh
    Department of Radiology, Massachusetts General Hospital, Boston, MA, United States. Electronic address: sghosh18@mgh.harvard.edu.
  • Anna Hadarovich
    Computational Biology Program, The University of Kansas, Lawrence, Kansas.
  • Lena Hersemann
    Max Planck Institute of Molecular Cell Biology and Genetics, Pfotenhauerstrasse 108, 01307 Dresden, Germany.
  • Agnes Toth-Petroczy
    Max Planck Institute of Molecular Cell Biology and Genetics, 01307, Dresden, Germany. toth-petroczy@mpi-cbg.de.