Out-of-the-box deep learning prediction of quantum-mechanical partial charges by graph representation and transfer learning.

Journal: Briefings in Bioinformatics
Published Date:

Abstract

Predicting atomic partial charges accurately with high-level quantum mechanics (QM) methods is computationally expensive. Numerous feature-engineered machine learning (ML)-based predictors with favorable computational efficiency and reliability have been developed as alternatives. However, engineering features for the atomic chemical environment requires extensive expert effort and may consequently introduce domain bias. In this study, SuperAtomicCharge, a data-driven deep graph learning framework, was proposed to predict three important types of partial charges (i.e. RESP, DDEC4 and DDEC78) derived from high-level QM calculations based on the structures of molecules. SuperAtomicCharge was designed to simultaneously exploit the 2D and 3D structural information of molecules, which proved to be an effective way to improve the prediction accuracy of the model. Moreover, a simple transfer learning strategy and a multitask learning strategy based on self-supervised descriptors were employed to further improve the prediction accuracy of the proposed model. Compared with the latest baselines, including one GNN-based predictor and two ML-based predictors, SuperAtomicCharge showed better performance on all three external test sets and had better usability and portability. Furthermore, the QM partial charges of new molecules predicted by SuperAtomicCharge can be efficiently used in drug design applications such as structure-based virtual screening, where the predicted RESP and DDEC4 charges of new molecules showed more robust scoring and screening power than the commonly used partial charges. Finally, two tools, an online server (http://cadd.zju.edu.cn/deepchargepredictor) and a command-line version with source code (https://github.com/zjujdj/SuperAtomicCharge), were developed to provide easy access to SuperAtomicCharge.
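
To illustrate what exploiting 2D and 3D structural information simultaneously can mean for a graph model, the following is a minimal sketch and not the SuperAtomicCharge implementation. It assumes a molecule is given as element symbols, Cartesian coordinates and a bond list; the function name `build_molecular_graph`, the 5.0 angstrom cutoff and the water geometry are hypothetical choices made only for this example.

```python
import math


def build_molecular_graph(elements, coords, bonds, cutoff=5.0):
    """Return atom labels plus 2D (bond) and 3D (distance) edge lists.

    Hypothetical helper for illustration; the cutoff is an assumed value.
    """
    n = len(elements)
    if len(coords) != n:
        raise ValueError("elements and coords must have the same length")

    # 2D edges: covalent bonds from the molecular topology, stored undirected.
    bond_edges = sorted({(min(i, j), max(i, j)) for i, j in bonds})

    # 3D edges: every atom pair within the cutoff, annotated with its distance.
    spatial_edges = []
    for i in range(n):
        for j in range(i + 1, n):
            distance = math.dist(coords[i], coords[j])
            if distance <= cutoff:
                spatial_edges.append((i, j, distance))

    return {
        "nodes": list(elements),
        "bond_edges": bond_edges,
        "spatial_edges": spatial_edges,
    }


# Water with an approximate geometry in angstroms (illustrative values only).
elements = ["O", "H", "H"]
coords = [(0.000, 0.000, 0.117), (0.000, 0.757, -0.469), (0.000, -0.757, -0.469)]
bonds = [(0, 1), (0, 2)]

graph = build_molecular_graph(elements, coords, bonds)
```

In this sketch the two hydrogens are not bonded, so they share no 2D edge, but they are connected by a 3D edge carrying their distance. A graph network with access to both edge types can therefore see through-space neighbours that the bond topology alone would miss.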

Authors

  • Dejun Jiang
    Innovation Institute for Artificial Intelligence in Medicine, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058 Zhejiang, P. R. China.
  • Huiyong Sun
    College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China.
  • Jike Wang
    School of Computer Science, Wuhan University, Wuhan, Hubei 430072, China.
  • Chang-Yu Hsieh
    Tencent Quantum Laboratory, Tencent, Shenzhen 518057 Guangdong, P. R. China.
  • Yuquan Li
    College of Chemistry and Chemical Engineering at Lanzhou University.
  • Zhenxing Wu
  • Dongsheng Cao
    School of Pharmaceutical Sciences, Central South University, Changsha, China. oriental-cds@163.com.
  • Jian Wu
    Department of Medical Technology, Jiangxi Medical College, Shangrao, Jiangxi, China.
  • Tingjun Hou
    College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China.