BioAutoMATED: An end-to-end automated machine learning tool for explanation and design of biological sequences.

Journal: Cell Systems
Published Date:

Abstract

The design choices underlying machine-learning (ML) models present important barriers to entry for many biologists who aim to incorporate ML in their research. Automated machine-learning (AutoML) algorithms can address many challenges that come with applying ML to the life sciences. However, these algorithms are rarely used in systems and synthetic biology studies because they typically do not explicitly handle biological sequences (e.g., nucleotide, amino acid, or glycan sequences) and cannot be easily compared with other AutoML algorithms. Here, we present BioAutoMATED, an AutoML platform for biological sequence analysis that integrates multiple AutoML methods into a unified framework. Users are automatically provided with relevant techniques for analyzing, interpreting, and designing biological sequences. BioAutoMATED predicts gene regulation, peptide-drug interactions, and glycan annotation, and designs optimized synthetic biology components, revealing salient sequence characteristics. By automating sequence modeling, BioAutoMATED allows life scientists to incorporate ML more readily into their work.
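To illustrate what handling a biological sequence explicitly can involve, here is a minimal sketch of one-hot encoding a nucleotide sequence into a numeric matrix that a general-purpose ML or AutoML tool could take as input. This is not BioAutoMATED's own code or API. The function name `one_hot_encode`, the `ACGT` alphabet, and the choice to leave unknown characters as all-zero rows are assumptions made for illustration.

```python
import numpy as np

# Assumed alphabet for this illustration; protein or glycan data would need a different one.
ALPHABET = "ACGT"


def one_hot_encode(seq, alphabet=ALPHABET):
    """Return a (len(seq), len(alphabet)) matrix with a single 1.0 per known character.

    Hypothetical helper: characters outside the alphabet (e.g. 'N') become all-zero rows.
    """
    index = {ch: i for i, ch in enumerate(alphabet)}
    encoded = np.zeros((len(seq), len(alphabet)), dtype=np.float32)
    for pos, ch in enumerate(seq.upper()):
        col = index.get(ch)
        if col is not None:
            encoded[pos, col] = 1.0
    return encoded


if __name__ == "__main__":
    # Each row is one position; columns follow the order of ALPHABET.
    print(one_hot_encode("ACGTN"))
```

Sequences of different lengths would still need padding or truncation to a common length before being stacked into one array. That step is omitted here.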

Authors

  • Jacqueline A Valeri
    Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA 02115, USA.
  • Luis R Soenksen
    Department of Biological Engineering, Massachusetts Institute of Technology (MIT), Cambridge, MA 02139, USA.
  • Katherine M Collins
    Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA 02115, USA; Department of Brain & Cognitive Sciences and Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA 02139, USA.
  • Pradeep Ramesh
    Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA 02115, USA.
  • George Cai
    Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA 02115, USA.
  • Rani Powers
    Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA 02115, USA; Pluto Biosciences, Golden, CO 80402, USA.
  • Nicolaas M Angenent-Mari
    Department of Biological Engineering, Massachusetts Institute of Technology (MIT), Cambridge, MA 02139, USA.
  • Diogo M Camacho
    Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA 02115, USA.
  • Felix Wong
    Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA, USA.
  • Timothy K Lu
    Synthetic Biology Group, Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, MA 02139, USA. timlu@mit.edu.
  • James J Collins
    Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA, USA.