Transforming the study of organisms: Phenomic data models and knowledge bases.

Journal: PLoS Computational Biology
Published Date:

Abstract

The rapidly decreasing cost of gene sequencing has resulted in a deluge of genomic data from across the tree of life; however, outside a few model organism databases, genomic data are limited in their scientific impact because they are not accompanied by computable phenomic data. The majority of phenomic data are contained in countless small, heterogeneous phenotypic data sets that are very difficult or impossible to integrate at scale because of variable formats, lack of digitization, and linguistic problems. One powerful solution is to represent phenotypic data using data models with precise, computable semantics, but adoption of semantic standards for representing phenotypic data has been slow, especially in biodiversity and ecology. Some phenotypic and trait data are available in a semantic language from knowledge bases, but these are often not interoperable. In this review, we will compare and contrast existing ontology and data models, focusing on nonhuman phenotypes and traits. We discuss barriers to integration of phenotypic data and make recommendations for developing an operationally useful, semantically interoperable phenotypic data ecosystem.

Authors

  • Anne E Thessen
    Environmental and Molecular Toxicology, Oregon State University, Corvallis, Oregon, United States of America.
  • Ramona L Walls
    CyVerse, University of Arizona, Tucson, AZ 85721 USA.
  • Lars Vogt
    Rheinische Friedrich-Wilhelms-Universität Bonn, Institut für Evolutionsbiologie und Ökologie, An der Immenburg 1, 53121, Bonn, Germany. lars.m.vogt@gmail.com.
  • Jessica Singer
    Annex Agriculture Inc., Saskatchewan, Canada.
  • Robert Warren
    Annex Agriculture Inc., Saskatchewan, Canada.
  • Pier Luigi Buttigieg
    Alfred-Wegener-Institut, Helmholtz-Zentrum für Polar- und Meeresforschung, Bremerhaven, Germany.
  • James P Balhoff
    National Evolutionary Synthesis Center, Durham, NC 27705, USA; University of North Carolina, Chapel Hill, NC 27599, USA.
  • Christopher J Mungall
    Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA.
  • Deborah L McGuinness
    Rensselaer Polytechnic Institute, Troy, NY.
  • Brian J Stucky
    Florida Museum of Natural History, University of Florida, Gainesville, Florida, United States of America.
  • Matthew J Yoder
    Illinois Natural History Survey, Champaign, Illinois, United States of America.
  • Melissa A Haendel
    Library, Oregon Health & Science University, Portland, OR 97239, USA.