Sculpting molecules in text-3D space: a flexible substructure aware framework for text-oriented molecular optimization.

Journal: BMC Bioinformatics
PMID:

Abstract

The integration of deep learning, particularly AI-Generated Content, with high-quality data derived from ab initio calculations has emerged as a promising avenue for transforming scientific research. However, designing molecular drugs or materials that incorporate multi-modality prior knowledge remains a critical and complex challenge. Specifically, a practical molecular design must not only meet diversity requirements but also satisfy the structural and textual constraints, with their various symmetries, that domain experts specify. In this article, we tackle this inverse design problem by formulating it as a multi-modality guidance optimization task. Our proposed solution, 3DToMolo, is a text-structure alignment symmetric diffusion framework for molecular optimization tasks. 3DToMolo aligns diverse modalities, including textual description features and graph structural features, to produce molecular structures that adhere to the symmetric structural and textual constraints specified by experts in the field. Experimental trials across three guidance optimization settings show superior hit optimization performance compared with state-of-the-art methods. Moreover, 3DToMolo can discover potential novel molecules that incorporate specified target substructures without requiring prior knowledge. This work is of general significance for the advancement of deep learning methodologies and also paves the way for a shift in molecular design strategies. 3DToMolo enables a more nuanced and effective exploration of the vast chemical space, opening new frontiers in the development of molecular entities with tailored properties and functionalities.

Authors

  • Kaiwei Zhang
    Institute of Information Engineering, Chinese Academy of Sciences, Beijing, 100085, China.
  • Yange Lin
    Huawei Technologies, Shenzhen, China.
  • Guangcheng Wu
    Department of Chemistry, The University of Hong Kong, Hong Kong SAR, 999077, China.
  • Yuxiang Ren
    School of Life Science and Bio-Pharmaceutics, Shenyang Pharmaceutical University, Shenyang, China.
  • Xuecang Zhang
    Huawei Technologies, Shenzhen, China.
  • Bo Wang
    Department of Clinical Laboratory Medicine Center, Inner Mongolia Autonomous Region People's Hospital, Hohhot, Inner Mongolia, China.
  • Xiao-Yu Zhang
  • Weitao Du
    Huawei Technologies, Shenzhen, China. duweitao@mail.ustc.edu.cn.