Relating enhancer genetic variation across mammals to complex phenotypes using machine learning.

Journal: Science (New York, N.Y.)
PMID:

Abstract

Protein-coding differences between species often fail to explain phenotypic diversity, suggesting the involvement of genomic elements that regulate gene expression, such as enhancers. Identifying associations between enhancers and phenotypes is challenging because enhancer activity can be tissue-dependent and functionally conserved despite low sequence conservation. We developed the Tissue-Aware Conservation Inference Toolkit (TACIT) to associate candidate enhancers with species' phenotypes using predictions from machine learning models trained on specific tissues. Applying TACIT to associate motor cortex and parvalbumin-positive interneuron enhancers with neurological phenotypes revealed dozens of enhancer-phenotype associations, including brain size-associated enhancers that interact with genes implicated in microcephaly or macrocephaly. TACIT provides a foundation for identifying enhancers associated with the evolution of any convergently evolved phenotype in any large group of species with aligned genomes.

Authors

  • Irene M Kaplow
    Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, United States.
  • Alyssa J Lawler
    Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, United States.
  • Daniel E Schäffer
    The Wistar Institute, Philadelphia, PA, 19104, USA.
  • Chaitanya Srinivasan
    Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA, USA.
  • Heather H Sestili
    Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA, USA.
  • Morgan E Wirthlin
    Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA, USA.
  • BaDoi N Phan
    Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, United States.
  • Kavya Prasad
    Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA, USA.
  • Ashley R Brown
    Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, United States.
  • Xiaomeng Zhang
    Institute of Basic Research in Clinical Medicine, China Academy of Chinese Medical Sciences, Beijing, 100700, People's Republic of China.
  • Kathleen Foley
    Department of Biological Sciences, Lehigh University, Bethlehem, PA, USA.
  • Diane P Genereux
    Broad Institute, Cambridge, MA, USA.
  • Elinor K Karlsson
    Broad Institute, Cambridge, MA, USA.
  • Kerstin Lindblad-Toh
    Broad Institute, Cambridge, MA, USA.
  • Wynn K Meyer
    Department of Biological Sciences, Lehigh University, Bethlehem, PA, USA.
  • Andreas R Pfenning
    Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, United States.