Small molecule generation via disentangled representation learning.

Journal: Bioinformatics (Oxford, England)
Published Date:

Abstract

MOTIVATION: Expanding our knowledge of small molecules beyond what is known in nature or designed in wet laboratories promises to significantly advance cheminformatics, drug discovery, biotechnology and material science. In silico molecular design remains challenging, primarily due to the complexity of the chemical space and the non-trivial relationship between chemical structures and biological properties. Deep generative models that learn directly from data are intriguing, but they have yet to demonstrate interpretability in the learned representation, which would let us learn more about the relationship between the chemical and biological spaces. In this article, we advance research on disentangled representation learning for small molecule generation. We build on recent work by us and others on deep graph generative frameworks, which capture atomic interactions via a graph-based representation of a small molecule. The methodological novelty lies in how we leverage the concept of disentanglement in the graph variational autoencoder framework both to generate biologically relevant small molecules and to enhance model interpretability.

Authors

  • Yuanqi Du
    Department of Computer Science, George Mason University, Fairfax, VA 22030, USA.
  • Xiaojie Guo
    State Key Laboratory of Information Security, Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093, China.
  • Yinkai Wang
    Department of Computer Science, George Mason University, Fairfax, VA 22030, USA.
  • Amarda Shehu
    Department of Computer Science, George Mason University, Fairfax, Virginia.
  • Liang Zhao
    Graduate School of Advanced Integrated Studies in Human Survivability (Shishu-Kan), Kyoto University, Kyoto, Japan.