Can One Hear the Shape of a Molecule (from its Coulomb Matrix Eigenvalues)?
Journal:
Journal of Chemical Information and Modeling
Published Date:
Jul 29, 2020
Abstract
Coulomb matrix eigenvalues (CMEs) are global 3D representations of molecular structure, which have been previously used to predict atomization energies, prioritize geometry searches, and interpret rotational spectra. The properties of the CME representation and its relationship to molecular structure are established using the Gershgorin circle theorem. Numerical bounds are studied using a data set of 309 000 conformational samples of all constitutional isomers of acyclic alkanes, CₙH₂ₙ₊₂, from methane (n = 1) to undecane (n = 11), to establish the extent to which the CME preserves chemical intuitions about isomer and conformer similarity and its ability to distinguish constitutional isomers. Neither supervised nor unsupervised machine-learning algorithms can perfectly distinguish constitutional isomers as the molecular size increases, but the misclassification rate can be kept below 1%.
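As an illustration of the two ingredients named in the abstract, the following minimal Python sketch builds a Coulomb matrix for methane, computes its eigenvalues, and checks them against Gershgorin intervals. It rests on assumptions that are not taken from the abstract: the matrix uses the commonly cited definition (diagonal 0.5·Z^2.4, off-diagonal Z_i·Z_j/|R_i − R_j| in atomic units), which may differ from the variant used in the paper; the methane geometry is an idealized tetrahedron with an approximate C–H distance of 1.09 Å; and the function names `coulomb_matrix` and `gershgorin_intervals` are hypothetical helpers introduced here.

```python
import numpy as np

ANGSTROM_TO_BOHR = 1.0 / 0.529177210903  # unit conversion to atomic units


def coulomb_matrix(Z, R):
    """Coulomb matrix under the commonly used definition (an assumption here):
    M_ii = 0.5 * Z_i**2.4, M_ij = Z_i * Z_j / |R_i - R_j| for i != j."""
    Z = np.asarray(Z, dtype=float)
    R = np.asarray(R, dtype=float)
    # Pairwise interatomic distances, shape (n_atoms, n_atoms).
    dist = np.linalg.norm(R[:, None, :] - R[None, :, :], axis=-1)
    np.fill_diagonal(dist, 1.0)  # placeholder; the diagonal is overwritten below
    M = np.outer(Z, Z) / dist
    np.fill_diagonal(M, 0.5 * Z ** 2.4)
    return M


def gershgorin_intervals(M):
    """Row-wise Gershgorin intervals [center - radius, center + radius].
    For a real symmetric matrix the discs reduce to intervals on the real line."""
    centers = np.diag(M)
    radii = np.sum(np.abs(M), axis=1) - np.abs(centers)
    return np.column_stack((centers - radii, centers + radii))


# Idealized tetrahedral methane (approximate geometry, for illustration only).
d = 1.09 * ANGSTROM_TO_BOHR / np.sqrt(3.0)
Z = np.array([6, 1, 1, 1, 1])
R = np.array([
    [0.0, 0.0, 0.0],
    [d, d, d],
    [d, -d, -d],
    [-d, d, -d],
    [-d, -d, d],
])

M = coulomb_matrix(Z, R)
# Sorted eigenvalues form the CME vector; sorting removes atom-order dependence.
cme = np.sort(np.linalg.eigvalsh(M))[::-1]

intervals = gershgorin_intervals(M)
lower, upper = intervals[:, 0].min(), intervals[:, 1].max()

# The same molecule with its atoms listed in a different order.
perm = np.array([4, 2, 0, 1, 3])
cme_permuted = np.sort(np.linalg.eigvalsh(coulomb_matrix(Z[perm], R[perm])))[::-1]

print("CME:", cme)
print("Gershgorin bounds:", lower, upper)
```

Every eigenvalue must lie in the union of the Gershgorin intervals, so `lower` and `upper` bound the whole spectrum using only atomic numbers and interatomic distances. Because the eigenvalues are sorted, `cme` and `cme_permuted` agree to numerical precision, which is the atom-ordering invariance that makes the CME usable as a fixed-length descriptor.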