Discovering genotype-phenotype relationships with machine learning and the Visual Physiology Opsin Database (VPOD).

Journal: GigaScience
PMID:

Abstract

BACKGROUND: Predicting phenotypes from genetic variation is foundational for fields as diverse as bioengineering and global change biology, highlighting the importance of efficient methods to predict gene functions. Linking genetic changes to phenotypic changes has been a goal of decades of experimental work, especially for some model gene families, including light-sensitive opsin proteins. Opsins can be expressed in vitro to measure light absorption parameters, including λmax (the wavelength of maximum absorbance), which strongly affects organismal phenotypes like color vision. Despite extensive research on opsins, the data remain dispersed, uncompiled, and often challenging to access, thereby precluding systematic and comprehensive analyses of the intricate relationships between genotype and phenotype.

Authors

  • Seth A Frazer
    Ecology, Evolution, and Marine Biology, University of California, Santa Barbara, California 93106, USA.
  • Mahdi Baghbanzadeh
    Computational Biology Institute, Department of Biostatistics and Bioinformatics, Milken Institute School of Public Health, The George Washington University, Washington, DC 20052, USA.
  • Ali Rahnavard
    Department of Biostatistics and Bioinformatics, Milken Institute School of Public Health, The George Washington University, Washington, DC 20052, USA.
  • Keith A Crandall
    Computational Biology Institute, Milken Institute School of Public Health, George Washington University, Washington, DC 20052, USA.
  • Todd H Oakley
    Ecology, Evolution, and Marine Biology, University of California, Santa Barbara, California 93106, USA.