GOATOOLS: A Python library for Gene Ontology analyses.

Journal: Scientific reports
PMID:

Abstract

The biological interpretation of gene lists with interesting shared properties, such as up- or down-regulation in a particular experiment, is typically accomplished using gene ontology enrichment analysis tools. Given a list of genes, a gene ontology (GO) enrichment analysis may return hundreds of statistically significant GO results in a "flat" list, which can be challenging to summarize. It can also be difficult to keep pace with rapidly expanding biological knowledge, which often results in daily changes to any of the more than 47,000 GO terms that describe biological knowledge. GOATOOLS, a Python-based library, makes it more efficient to stay current with the latest ontologies and annotations, perform gene ontology enrichment analyses to determine over- and under-represented terms, and organize results for greater clarity and easier interpretation using a novel GOATOOLS GO grouping method. We performed functional analyses on both stochastic simulation data and real data from a published RNA-seq study to compare the enrichment results from GOATOOLS to two other popular tools: DAVID and GOstats. GOATOOLS is freely available through GitHub: https://github.com/tanghaibao/goatools.
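
The idea of testing whether a GO term is over- or under-represented in a study gene list can be illustrated with a small two-sided Fisher's exact test. The sketch below is not GOATOOLS code and does not use its API; the function names (`hypergeom_pmf`, `term_enrichment`) and the "over"/"under" labels are hypothetical, chosen only for illustration, and the sketch omits the multiple-testing correction that a real analysis of many terms would need.

```python
from math import comb


def hypergeom_pmf(k, pop_size, pop_count, study_size):
    """Probability that exactly k of study_size genes drawn from the
    population carry the term, given pop_count annotated genes overall."""
    return (comb(pop_count, k)
            * comb(pop_size - pop_count, study_size - k)
            / comb(pop_size, study_size))


def term_enrichment(study_count, study_size, pop_count, pop_size):
    """Two-sided Fisher's exact p-value for one GO term, plus a direction
    label: 'over' if the term is more frequent in the study than in the
    population, otherwise 'under'. (Hypothetical helper, for illustration.)"""
    k_min = max(0, study_size - (pop_size - pop_count))
    k_max = min(pop_count, study_size)
    observed = hypergeom_pmf(study_count, pop_size, pop_count, study_size)
    # Sum the probabilities of all outcomes at least as unlikely as the
    # observed one; the small tolerance guards against float rounding.
    p_value = 0.0
    for k in range(k_min, k_max + 1):
        prob = hypergeom_pmf(k, pop_size, pop_count, study_size)
        if prob <= observed * (1 + 1e-7):
            p_value += prob
    study_ratio = study_count / study_size
    pop_ratio = pop_count / pop_size
    direction = "over" if study_ratio > pop_ratio else "under"
    return min(p_value, 1.0), direction


# Example: 5 of 10 study genes carry a term that annotates 10 of 100
# population genes, so the term is far more frequent in the study.
p_value, direction = term_enrichment(5, 10, 10, 100)
print(direction, p_value)
```

In practice this test is repeated for every GO term with annotations in the study, which is why the resulting "flat" list of significant terms can grow to hundreds of entries.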

Authors

  • D V Klopfenstein
    School of Biomedical Engineering, Science, and Health Systems, Drexel University, Philadelphia, PA, USA.
  • Liangsheng Zhang
    State Key Laboratory of Agricultural Microbiology, College of Veterinary Medicine, Huazhong Agricultural University, Wuhan, China.
  • Brent S Pedersen
    Department of Human Genetics, University of Utah, Salt Lake City, UT 84105, USA; Department of Biomedical Informatics, University of Utah, Salt Lake City, UT 84105, USA; USTAR Center for Genetic Discovery, University of Utah, Salt Lake City, UT 84105, USA.
  • Fidel Ramírez
    Max Planck Institute of Immunobiology and Epigenetics, Freiburg, Germany.
  • Alex Warwick Vesztrocy
    Department of Genetics, Evolution and Environment, University College London, London, UK.
  • Aurélien Naldi
    Lifeware Group, Inria, Saclay-Île-de-France, Palaiseau, France.
  • Christopher J Mungall
    Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA.
  • Jeffrey M Yunes
    UC Berkeley - UCSF Graduate Program in Bioengineering, University of California, San Francisco, CA, USA.
  • Olga Botvinnik
    Bioinformatics and Systems Biology Program, University of California, San Diego, CA, USA.
  • Mark Weigel
    Independent Researcher, Philadelphia, PA, USA.
  • Will Dampier
    School of Biomedical Engineering, Science, and Health Systems, Drexel University, Philadelphia, PA, USA.
  • Christophe Dessimoz
    Department of Genetics, Evolution and Environment, University College London, Gower St, London, WC1E 6BT, UK. Christophe.Dessimoz@unil.ch.
  • Patrick Flick
    School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, GA, USA.
  • Haibao Tang
    Center for Genomics and Biotechnology, Fujian Agriculture and Forestry University, Fuzhou, China. tanghaibao@gmail.com.