Deep learning of immune cell differentiation.

Journal: Proceedings of the National Academy of Sciences of the United States of America
Published Date:

Abstract

Although many sequence-specific transcription factors (TFs) are known, how the DNA sequence of cis-regulatory elements is decoded and orchestrated on the genome scale to determine immune cell differentiation remains poorly understood. Leveraging a granular atlas of chromatin accessibility across 81 immune cell types, we asked whether a convolutional neural network (CNN) could learn to infer cell type-specific chromatin accessibility solely from regulatory DNA sequences. With a tailored architecture and an ensemble approach to CNN parameter interpretation, we show that our trained network ("AI-TAC") does so by rediscovering ab initio the binding motifs for known regulators and some unknown ones. Motifs that AI-TAC learns, purely computationally, to be functionally important overlap strikingly well with the binding positions determined by chromatin immunoprecipitation for several TFs. AI-TAC establishes a hierarchy of TFs and their interactions that drives lineage specification, and it also identifies stage-specific interactions, such as Pax5/Ebf1 vs. Pax5/Prdm1, and the roles of different NF-κB dimers in different cell types. AI-TAC assigns Spi1/Cebp and Pax5/Ebf1 as the drivers necessary for myeloid and B lineage fates, respectively, but no factor appeared as dominantly required for T cell differentiation, which may represent a fall-back pathway. Mouse-trained AI-TAC can parse human DNA, revealing a strikingly similar ranking of influential TFs and providing additional support for AI-TAC as a generalizable regulatory sequence decoder. Thus, deep learning can reveal the regulatory syntax predictive of the full differentiative complexity of the immune system.
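
As a rough illustration of the idea that a first-layer convolutional filter can behave like a motif detector on DNA, the sketch below one-hot encodes a sequence and slides a single filter along it. It is not the AI-TAC architecture and uses none of its learned parameters: the function names, the toy motif GGAA, and the +1/-1 filter weights are all assumptions chosen for illustration.

```python
# Toy sketch only: hypothetical names and weights, not AI-TAC's model.
BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as rows of [A, C, G, T]; unknown bases (e.g. N) become all zeros."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]


def filter_from_motif(motif, match=1.0, mismatch=-1.0):
    """Build a hand-made (width x 4) filter that rewards the motif's base at each position."""
    return [[match if b == base else mismatch for b in BASES] for base in motif.upper()]


def conv1d_scan(encoded, filt):
    """Slide one filter along the encoded sequence (cross-correlation), then apply ReLU."""
    width = len(filt)
    scores = []
    for start in range(len(encoded) - width + 1):
        activation = sum(
            encoded[start + i][j] * filt[i][j]
            for i in range(width)
            for j in range(4)
        )
        scores.append(max(0.0, activation))  # ReLU keeps only positive evidence
    return scores


def best_hit(seq, motif):
    """Return (position, score) of the strongest activation, i.e. a global max-pool.

    Assumes len(seq) >= len(motif); otherwise there is nothing to scan.
    """
    scores = conv1d_scan(one_hot(seq), filter_from_motif(motif))
    top = max(scores)
    return scores.index(top), top


if __name__ == "__main__":
    print(best_hit("TTTGGAATTT", "GGAA"))  # (3, 4.0)
```

In a trained network the filter weights are learned from data rather than written by hand, and interpreting them means asking which sequence patterns activate each filter most strongly. The hand-built filter here simply makes that correspondence explicit.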

Authors

  • Alexandra Maslova
    Department of Statistics, University of British Columbia, Vancouver, BC V6T 1Z4, Canada.
  • Ricardo N Ramirez
    Department of Immunology, Harvard Medical School, Boston, MA 02115.
  • Ke Ma
    Shanghai Key Laboratory of Crime Scene Evidence, Shanghai Research Institute of Criminal Science and Technology, Shanghai 200083, China; yangfyhit@sina.com; +86 021 22028363; +86 021 22028362.
  • Hugo Schmutz
    Department of Immunology, Harvard Medical School, Boston, MA 02115.
  • Chendi Wang
    Department of Statistics, University of British Columbia, Vancouver, BC V6T 1Z4, Canada.
  • Curtis Fox
    Department of Computer Science, University of British Columbia, Vancouver, BC V6T 1Z4, Canada.
  • Bernard Ng
    Department of Statistics, University of British Columbia, Vancouver, BC V6T 1Z4, Canada.
  • Christophe Benoist
    Department of Immunology, Harvard Medical School, Boston, MA 02115; cb@hms.harvard.edu.
  • Sara Mostafavi
    Department of Statistics, University of British Columbia, Vancouver, BC V6T 1Z4, Canada; saram@stat.ubc.ca.