Quantum-Informed Molecular Representation Learning Enhancing ADMET Property Prediction.

Journal: Journal of Chemical Information and Modeling
PMID:

Abstract

We examined pretraining tasks that leverage abundant labeled data to enhance molecular representation learning for downstream tasks, focusing on graph transformers for ADMET property prediction. Our investigation revealed limitations in previous pretraining tasks and identified more meaningful training targets, ranging from 2D molecular descriptors to extensive quantum chemistry simulations, which were integrated into supervised pretraining tasks. Combining this pretraining strategy with multitask learning outperforms conventional methods, achieving state-of-the-art results on 7 of the 22 ADMET tasks in the Therapeutics Data Commons while using a single encoder shared across all tasks. Our approach underscores the effectiveness of the learned molecular representations and highlights their potential to scale with larger data sets, marking a significant advance in this domain.
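
The overall workflow described in the abstract (supervised pretraining of a molecular encoder on descriptor and quantum-chemistry targets, followed by multitask fine-tuning with a shared encoder and per-task heads) can be illustrated with a minimal PyTorch sketch. This is not the authors' implementation: the plain MLP encoder stands in for their graph transformer, the random tensors stand in for molecular graph features and labels, and the task names and dimensions are hypothetical placeholders.

# Illustrative sketch only, not the published method's code.
import torch
import torch.nn as nn

class SharedEncoder(nn.Module):
    """Stand-in for a graph transformer encoder; shared across all tasks."""
    def __init__(self, in_dim=256, hidden=512, out_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, out_dim), nn.ReLU(),
        )

    def forward(self, x):
        return self.net(x)

def pretrain(encoder, n_targets=50, steps=100):
    # Supervised pretraining: regress precomputed 2D-descriptor /
    # quantum-chemistry targets from molecular features.
    head = nn.Linear(128, n_targets)
    opt = torch.optim.Adam(
        list(encoder.parameters()) + list(head.parameters()), lr=1e-3)
    for _ in range(steps):
        x = torch.randn(64, 256)        # placeholder molecular features
        y = torch.randn(64, n_targets)  # placeholder descriptor targets
        loss = nn.functional.mse_loss(head(encoder(x)), y)
        opt.zero_grad()
        loss.backward()
        opt.step()

def finetune_multitask(encoder, task_dims, steps=100):
    # Multitask fine-tuning: one lightweight head per ADMET task,
    # with the encoder shared across tasks.
    heads = nn.ModuleDict({t: nn.Linear(128, d) for t, d in task_dims.items()})
    opt = torch.optim.Adam(
        list(encoder.parameters()) + list(heads.parameters()), lr=1e-4)
    for _ in range(steps):
        for t, d in task_dims.items():
            x = torch.randn(32, 256)    # placeholder batch for task t
            y = torch.randn(32, d)      # placeholder labels for task t
            loss = nn.functional.mse_loss(heads[t](encoder(x)), y)
            opt.zero_grad()
            loss.backward()
            opt.step()

encoder = SharedEncoder()
pretrain(encoder)
finetune_multitask(encoder, {"caco2_permeability": 1, "half_life": 1})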

Authors

  • Jungwoo Kim
    Standigm Inc., 182 Dogok-ro, 6F, Gangnam-gu, Seoul 06261, Korea.
  • Woojae Chang
    Standigm Inc., 182 Dogok-ro, 6F, Gangnam-gu, Seoul 06261, Korea.
  • Hyunjun Ji
    Standigm Inc., 182 Dogok-ro, 6F, Gangnam-gu, Seoul 06261, Korea.
  • InSuk Joung
    Center for In Silico Protein Science, Korea Institute for Advanced Study, Seoul, South Korea.