Machine Learning Strategies for Reaction Development: Toward the Low-Data Limit.

Journal: Journal of Chemical Information and Modeling
Published Date:

Abstract

Machine learning models are increasingly being utilized to predict outcomes of organic chemical reactions. A large amount of reaction data is used to train these models, which is in stark contrast to how expert chemists discover and develop new reactions by leveraging information from a small number of relevant transformations. Transfer learning and active learning are two strategies that can operate in low-data situations, which may help fill this gap and promote the use of machine learning for tackling real-world challenges in organic synthesis. This Perspective introduces active and transfer learning and connects these to potential opportunities and directions for further research, especially in the area of prospective development of chemical transformations.

Authors

  • Eunjae Shim
    Department of Chemistry, University of Michigan, Ann Arbor, Michigan 48109, United States.
  • Ambuj Tewari
    Department of Statistics, University of Michigan, 1085 South University Avenue, Ann Arbor, Michigan 48109, United States.
  • Tim Cernak
    Department of Chemistry, University of Michigan, Ann Arbor, Michigan 48109, United States.
  • Paul M Zimmerman
    Department of Chemistry, University of Michigan, 930 North University Avenue, Ann Arbor, Michigan 48109, United States.