Designing Eukaryotic Gene Expression Regulation Using Machine Learning.

Journal: Trends in Biotechnology
Published Date:

Abstract

Controlling gene expression is one of the key challenges of synthetic biology. Until recently, fine-tuned control was out of reach, particularly in eukaryotes, owing to the complexity of their gene regulation. With advances in machine learning (ML), and in particular with increasing dataset sizes, models that predict gene expression levels from regulatory sequences can now be constructed successfully. Such models form the cornerstone of algorithms that allow users to design regulatory regions that achieve a specific gene expression level. In this review, we discuss strategies for data collection, data encoding, ML practices, design algorithm choices, and model interpretation. Ultimately, these developments will provide synthetic biologists with highly specific genetic building blocks to rationally engineer complex pathways and circuits.

Authors

  • Ronald P H de Jongh
    Bioinformatics Group, Wageningen University and Research, Wageningen, The Netherlands; Laboratory of Systems and Synthetic Biology, Wageningen University and Research, The Netherlands.
  • Aalt D J van Dijk
    Biometris, Wageningen University and Research, Droevendaalsesteeg 1, 6708 PB, Wageningen, The Netherlands.
  • Mattijs K Julsing
    Wageningen Food and Biobased Research, Wageningen University and Research, Wageningen, The Netherlands.
  • Peter J Schaap
    Laboratory of Systems and Synthetic Biology, Wageningen University and Research, The Netherlands.
  • Dick de Ridder
    Bioinformatics Group, Wageningen University and Research, Droevendaalsesteeg 1, 6708 PB, Wageningen, The Netherlands.