Roughness of Molecular Property Landscapes and Its Impact on Modellability.

Journal: Journal of Chemical Information and Modeling
Published Date:

Abstract

In molecular discovery and drug design, structure-property relationships and activity landscapes are often qualitatively or quantitatively analyzed to guide the navigation of chemical space. The roughness (or smoothness) of these molecular property landscapes is one of their most studied geometric attributes, as it can characterize the presence of activity cliffs, with rougher landscapes generally expected to pose tougher optimization challenges. Here, we introduce a general, quantitative measure for describing the roughness of molecular property landscapes. The proposed roughness index (ROGI) is loosely inspired by the concept of fractal dimension and strongly correlates with the out-of-sample error achieved by machine learning models on numerous regression tasks.

Authors

  • Matteo Aldeghi
    Chemical Physics Theory Group, Department of Chemistry, University of Toronto, Toronto, ON M5S 3H6, Canada; Department of Computer Science, University of Toronto, Toronto, ON M5S 3H6, Canada; Vector Institute for Artificial Intelligence, Toronto, ON M5S 1M1, Canada.
  • David E Graff
    Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts 02138, United States.
  • Nathan Frey
    Lincoln Laboratory, Massachusetts Institute of Technology, Lexington, Massachusetts 02421, United States.
  • Joseph A Morrone
    IBM Thomas J. Watson Research Center, Yorktown Heights, NY 10598, USA.
  • Edward O Pyzer-Knapp
    IBM Research U.K., Hartree Centre, Daresbury WA4 4AD, United Kingdom.
  • Kirk E Jordan
    IBM Thomas J. Watson Research Center, Cambridge, Massachusetts 02142, United States.
  • Connor W Coley
    Department of Chemical Engineering, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA.