Uncovering the Mechanistic Landscape of Regulatory DNA with Deep Learning
Journal: bioRxiv
Published Date: Jan 1, 2025
Abstract
The regulatory genome encodes the logic that governs gene expression, enabling cells to respond to developmental, environmental, and evolutionary cues. This logic arises from complex cis-regulatory mechanisms that integrate transcription factor motifs, their syntactical arrangement, and surrounding sequence context, features that remain challenging to decode. Here, we present SEAM (Systematic Explanation of Attribution-based Mechanisms), a computational framework that combines deep learning with explainable AI to map the mechanistic impact of genetic mutations. Applied to human and Drosophila regulatory loci, SEAM uncovers functional binding sites at sequences of interest and identifies which mutations preserve, disrupt, or create novel binding sites. SEAM also reveals that two qualitatively distinct classes of regulatory signal are operative at many loci: signals that are robust to mutation and signals that are readily reprogrammable. These results clarify the inherent ability of regulatory DNA to evolve. They also position SEAM as a versatile framework for interpreting non-coding variants and for informing the mechanism-aware design of synthetic sequences.