Challenges in the annotation of pseudoenzymes in databases: the UniProtKB approach.

Journal: The FEBS journal
Published Date:

Abstract

The universal protein knowledgebase (UniProtKB) collects and centralises functional information on proteins across a wide range of species. In addition to the functional information added to all protein entries, for enzymes, which represent 20–40% of most proteomes, UniProtKB provides additional information about Enzyme Commission classification, catalytic activity, cofactors, enzyme regulation, kinetics and pathways, all based on critical assessment of published experimental data. Computer-based analysis and structural data are used to enrich the annotation of the sequence through the identification of active sites and binding sites. While the annotation of enzymes is well defined, the curation of pseudoenzymes in UniProtKB has highlighted some challenges: how to identify them, how to assess their lack of catalytic activity, how to annotate that lack of activity in a consistent way, and how much can be inferred and propagated from experimental data obtained in other species. Through various examples, we illustrate some of these issues and discuss some of the changes we propose to enhance the annotation and discovery of pseudoenzymes. Ultimately, improving their curation will provide the scientific community with a comprehensive resource on pseudoenzymes, which in turn will lead to a better understanding of the evolution of these molecules, the aetiology of related diseases and the development of drugs.

Authors

  • Rossana Zaru
    European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, UK.
  • Michele Magrane
    European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, UK.
  • Sandra Orchard
    European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK.