Applications of machine learning in phylogenetics.

Journal: Molecular Phylogenetics and Evolution
PMID:

Abstract

Machine learning has increasingly been applied to a wide range of questions in phylogenetic inference. Supervised machine learning approaches that rely on simulated training data have been used to infer tree topologies and branch lengths, to select substitution models, and to perform downstream inferences of introgression and diversification. Here, we review how researchers have used several promising machine learning approaches to make phylogenetic inferences. Despite the promise of these methods, several barriers prevent supervised machine learning from reaching its full potential in phylogenetics. We discuss these barriers and potential paths forward. In the future, we expect that the application of careful network designs and data encodings will allow supervised machine learning to accommodate the complex processes that continue to confound traditional phylogenetic methods.

Authors

  • Yu K Mo
    Department of Computer Science, Indiana University, Bloomington, IN 47405, USA.
  • Matthew W Hahn
    Department of Computer Science, Indiana University, Bloomington, IN 47405, USA; Department of Biology, Indiana University, Bloomington, IN 47405, USA.
  • Megan L Smith
    Department of Biology, Indiana University, Bloomington, IN, USA.