Fast activation maximization for molecular sequence design.

Journal: BMC Bioinformatics
Published Date:

Abstract

BACKGROUND: Optimization of DNA and protein sequences based on machine learning models is becoming a powerful tool for molecular design. Activation maximization offers a simple design strategy for differentiable models: one-hot encoded sequences are first approximated by a continuous representation, which is then iteratively optimized with respect to the predictor oracle by gradient ascent. While elegant, the current version of the method suffers from vanishing gradients and may cause predictor pathologies, leading to poor convergence.
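
The basic procedure described above can be illustrated with a minimal sketch. It is not the authors' implementation: the predictor here is a hypothetical linear position-weight score (`toy_oracle`), the continuous representation is assumed to be a per-position softmax over logits, and the names `activation_maximization`, `steps`, and `lr` are chosen for illustration only. A real application would replace the toy oracle with a trained differentiable model and compute gradients by automatic differentiation.

```python
import numpy as np

ALPHABET = "ACGT"

def softmax(logits):
    # Row-wise softmax: each position becomes a distribution over 4 nucleotides.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def toy_oracle(p, w):
    # Hypothetical differentiable predictor (an assumption, not the paper's model):
    # a linear score of the relaxed sequence p against weights w.
    return float(np.sum(p * w))

def activation_maximization(w, steps=200, lr=0.5):
    # Start from all-zero logits, i.e. a uniform distribution at every position.
    logits = np.zeros_like(w)
    for _ in range(steps):
        p = softmax(logits)
        # Analytic gradient of the score with respect to the logits:
        # d score / d logits[i, j] = p[i, j] * (w[i, j] - sum_k p[i, k] * w[i, k])
        grad = p * (w - np.sum(p * w, axis=1, keepdims=True))
        logits += lr * grad  # gradient ascent step
    return logits

rng = np.random.default_rng(1)
w = rng.normal(size=(8, 4))  # toy weights for a length-8 sequence

logits = activation_maximization(w)
# Discretize the relaxed sequence by taking the most likely letter per position.
designed = "".join(ALPHABET[i] for i in logits.argmax(axis=1))
final_score = toy_oracle(softmax(logits), w)
```

In this toy setting the gradient carries a factor of `p`, so it shrinks as the softmax saturates toward a one-hot sequence. That is one simple way gradients can vanish in a softmax relaxation; whether it matches the failure mode studied in the paper is not stated in the abstract.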

Authors

  • Johannes Linder
    Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, WA 98195, USA.
  • Georg Seelig
    Department of Electrical & Computer Engineering, University of Washington, Seattle, WA 98195, USA; Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, WA 98195, USA. Electronic address: gseelig@uw.edu.