Annotation of gene product function from high-throughput studies using the Gene Ontology.

Journal: Database : the journal of biological databases and curation
Published Date:

Abstract

High-throughput studies constitute an essential and valued source of information for researchers. However, high-throughput experimental workflows are often complex, with multiple data sets that may contain large numbers of false positives. The representation of high-throughput data in the Gene Ontology (GO) therefore presents a challenging annotation problem, given that the overarching goal of GO curation is to provide the most precise view of a gene's role in biology. To address this, representatives from annotation teams within the GO Consortium reviewed high-throughput data annotation practices. We present an annotation framework for high-throughput studies that will facilitate good standards in GO curation and, through the use of new high-throughput evidence codes, increase the visibility of these annotations to the research community.

Authors

  • Helen Attrill
    FlyBase, Department of Physiology, Development and Neuroscience, University of Cambridge, Downing Street, Cambridge, UK.
  • Pascale Gaudet
    Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland, SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland, Department of Microbiology and Immunology and Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore MD, USA, SIB Swiss Institute of Bioinformatics, 1 Rue Michel Servet, 1211 Geneva, Switzerland, Department of Medicine and Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore MD, USA, Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, CA 94158, USA, School of Information, University of South Florida, Tampa, FL, 33647, USA, Genomics Division, Lawrence Berkeley National Lab, 1 Cyclotron Rd., Berkeley, 94720 CA USA, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK, Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, Geneva, Switzerland, ETH Zurich, Department of Computer Science, Universitätstr. 19, 8092 Zürich, Switzerland, SIB Swiss Institute of Bioinformatics, Universitätstr. 6, 8092 Zürich, Switzerland and University College London, Gower St, London WC1E 6BT, UK.
  • Rachael P Huntley
    European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK. huntley@ebi.ac.uk.
  • Ruth C Lovering
    Centre for Cardiovascular Genetics, Institute of Cardiovascular Science, University College London, Rayne Building, 5 University Street, London, WC1E 6JF, UK. r.lovering@ucl.ac.uk.
  • Stacia R Engel
    Saccharomyces Genome Database, Department of Genetics, Stanford University, Porter Drive, Palo Alto, CA, USA.
  • Sylvain Poux
    Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland, SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland, Department of Microbiology and Immunology and Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore MD, USA, SIB Swiss Institute of Bioinformatics, 1 Rue Michel Servet, 1211 Geneva, Switzerland, Department of Medicine and Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore MD, USA, Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, CA 94158, USA, School of Information, University of South Florida, Tampa, FL, 33647, USA, Genomics Division, Lawrence Berkeley National Lab, 1 Cyclotron Rd., Berkeley, 94720 CA USA, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK, Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, Geneva, Switzerland, ETH Zurich, Department of Computer Science, Universitätstr. 19, 8092 Zürich, Switzerland, SIB Swiss Institute of Bioinformatics, Universitätstr. 6, 8092 Zürich, Switzerland and University College London, Gower St, London WC1E 6BT, UK.
  • Kimberly M Van Auken
    WormBase, Division of Biology and Biological Engineering, California Institute of Technology, E California Blvd, Pasadena, CA, USA.
  • George Georghiou
    European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK.
  • Marcus C Chibucos
    Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland, SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland, Department of Microbiology and Immunology and Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore MD, USA, SIB Swiss Institute of Bioinformatics, 1 Rue Michel Servet, 1211 Geneva, Switzerland, Department of Medicine and Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore MD, USA, Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, CA 94158, USA, School of Information, University of South Florida, Tampa, FL, 33647, USA, Genomics Division, Lawrence Berkeley National Lab, 1 Cyclotron Rd., Berkeley, 94720 CA USA, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK, Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, Geneva, Switzerland, ETH Zurich, Department of Computer Science, Universitätstr. 19, 8092 Zürich, Switzerland, SIB Swiss Institute of Bioinformatics, Universitätstr. 6, 8092 Zürich, Switzerland and University College London, Gower St, London WC1E 6BT, UK.
  • Tanya Z Berardini
    Arabidopsis Information Resource, Phoenix Bioinformatics, Redwood City, CA 94063, USA.
  • Valerie Wood
    PomBase, Cambridge Systems Biology Centre and Department of Biochemistry, University of Cambridge, Sanger Building, Tennis Court Road, Cambridge, UK.
  • Harold Drabkin
    The Jackson Laboratory, 600 Main Street, Bar Harbor, ME, 04609, USA.
  • Petra Fey
    dictyBase, Biomedical Informatics Center and Center for Genetic Medicine, Northwestern University, Feinberg School of Medicine, North Lake Shore Drive, Chicago, IL, USA.
  • Penelope Garmiri
    European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK.
  • Midori A Harris
    PomBase, Cambridge Systems Biology Centre and Department of Biochemistry, University of Cambridge, Sanger Building, Tennis Court Road, Cambridge, UK.
  • Tony Sawford
    European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
  • Leonore Reiser
    The Arabidopsis Information Resource, Phoenix Bioinformatics, Redwood City, CA, USA.
  • Rebecca Tauber
    Evidence and Conclusion Ontology, University of Maryland School of Medicine, W Baltimore St., Baltimore, MD, USA.
  • Sabrina Toro
    Zebrafish Information Network, University of Oregon, Eugene, OR, USA.