Predicting expression-altering promoter mutations with deep learning.

Journal: Science (New York, N.Y.)
Published Date:

Abstract

Only a minority of patients with rare genetic diseases are currently diagnosed by exome sequencing, suggesting that additional unrecognized pathogenic variants may reside in non-coding sequence. Here, we describe PromoterAI, a deep neural network that accurately identifies non-coding promoter variants which dysregulate gene expression. We show that promoter variants with predicted expression-altering consequences produce outlier expression at both RNA and protein levels in thousands of individuals, and that these variants experience strong negative selection in human populations. We observe that clinically relevant genes in rare disease patients are enriched for such variants and validate their functional impact through reporter assays. Our estimates suggest that promoter variation accounts for 6% of the genetic burden associated with rare diseases.

Authors

  • Kishore Jaganathan
    Illumina Artificial Intelligence Laboratory, Illumina, Inc., San Diego, CA, USA.
  • Nicole Ersaro
    Illumina Artificial Intelligence Laboratory, Illumina, Inc., San Diego, CA, USA.
  • Gherman Novakovsky
    Centre for Molecular Medicine and Therapeutics, BC Children's Hospital Research Institute, Vancouver, BC, V5Z 4H4, Canada.
  • Yuchuan Wang
    Illumina Artificial Intelligence Laboratory, Illumina, Inc., San Diego, CA, USA.
  • Terena James
    Illumina Artificial Intelligence Laboratory, Illumina, Inc., San Diego, CA, USA.
  • Jeremy Schwartzentruber
    Open Targets, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK.
  • Petko Fiziev
    Illumina Artificial Intelligence Laboratory, Illumina, Inc., San Diego, CA, USA.
  • Irfahan Kassam
    Illumina Artificial Intelligence Laboratory, Illumina, Inc., San Diego, CA, USA.
  • Fan Cao
    Cancer Science Institute of Singapore, National University of Singapore, 14 Medical Dr, Singapore, 117599, Singapore.
  • Johann Hawe
    Illumina Artificial Intelligence Laboratory, Illumina, Inc., San Diego, CA, USA.
  • Henry Cavanagh
    Centre for Integrative Systems Biology and Bioinformatics, Imperial College London, London, SW7 2BU, UK.
  • Ashley Lim
    Illumina Artificial Intelligence Laboratory, Illumina, Inc., San Diego, CA, USA.
  • Grace Png
    Illumina Artificial Intelligence Laboratory, Illumina, Inc., San Diego, CA, USA.
  • Jeremy McRae
    Illumina Artificial Intelligence Laboratory, Illumina, Inc., San Diego, CA, USA.
  • Abhimanyu Banerjee
    Physics Department, Stanford University, Stanford, CA, USA.
  • Arvind Kumar
    International Rice Research Institute, Los Baños, Philippines.
  • Jacob Ulirsch
    Broad Institute of MIT and Harvard, Cambridge, MA, USA.
  • Yan Zhang
    Affiliated Hospital of Liaoning University of Traditional Chinese Medicine, Shenyang, 110032, China.
  • Francois Aguet
    Broad Institute of MIT and Harvard, Cambridge, MA, USA.
  • Pierrick Wainschtein
    Illumina Artificial Intelligence Laboratory, Illumina, Inc., San Diego, CA, USA.
  • Laksshman Sundaram
    Illumina Artificial Intelligence Laboratory, Illumina, Inc., San Diego, CA, USA.
  • Adriana Salcedo
    Illumina Artificial Intelligence Laboratory, Illumina, Inc., San Diego, CA, USA.
  • Sofia Kyriazopoulou Panagiotopoulou
    Illumina Artificial Intelligence Laboratory, Illumina, Inc., San Diego, CA, USA.
  • Delasa Aghamirzaie
    Genetics, Bioinformatics and Computational Biology, Virginia Polytechnic Institute and State University, Blacksburg, VA, USA.
  • Evin Padhi
    Department of Pathology, Stanford University, Stanford, CA, USA.
  • Ziming Weng
    Department of Pathology, Stanford University, Stanford, CA, USA.
  • Shan Dong
    Haidian Maternal & Child Health Hospital Nutrition Clinic, Beijing 100080, China.
  • Damian Smedley
    School of ITEE, The University of Queensland, St. Lucia, QLD 4072, Australia; Garvan Institute of Medical Research, Darlinghurst, Sydney, NSW 2010, Australia; Institute for Medical Genetics and Human Genetics, Charité-Universitätsmedizin Berlin, 13353 Berlin, Germany; European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, UK; National Institute of Informatics, Hitotsubashi, Tokyo, Japan; Mouse Informatics Group, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK; LASIGE, Departamento de Informática, Faculdade de Ciências, Universidade de Lisboa, 1749-016 Lisboa, Portugal; Genetic Services of Western Australia, King Edward Memorial Hospital, WA 6008, Australia; School of Paediatrics and Child Health, University of Western Australia, WA 6008, Australia; Institute for Immunology and Infectious Diseases, Murdoch University, WA 6150, Australia; Office of Population Health, Public Health and Clinical Services Division, Western Australian Department of Health, WA 6004, Australia; Academic Department of Medical Genetics, Sydney Children's Hospitals Network (Westmead), NSW 2145, Australia; Discipline of Genetic Medicine, Sydney Medical School, The University of Sydney, NSW 2006, Australia; Max Planck Institute for Molecular Genetics, 14195 Berlin, Germany; Institute for Bioinformatics, Department of Mathematics and Computer Science, Freie Universität Berlin, 14195 Berlin, Germany; and Berlin Brandenburg Center for Regenerative Therapies, 13353 Berlin, Germany.
  • Mark Caulfield
    William Harvey Research Institute, Queen Mary University of London, London, UK.
  • Anne O'Donnell-Luria
    Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA.
  • Heidi L Rehm
    Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA.
  • Stephan J Sanders
    Department of Psychiatry and Behavioral Sciences, UCSF Weill Institute for Neurosciences, University of California, San Francisco, CA, 94143, United States.
  • Anshul Kundaje
    Department of Computer Science, Stanford University, Stanford, CA, USA.
  • Stephen B Montgomery
    Department of Genetics, Stanford University, Stanford, CA 94305, USA.
  • Mark T Ross
    Illumina Artificial Intelligence Laboratory, Illumina, Inc., San Diego, CA, USA.
  • Kyle Kai-How Farh
    Illumina Artificial Intelligence Laboratory, Illumina, Inc., San Diego, CA, USA.
