Comprehensive Study on Molecular Supervised Learning with Graph Neural Networks.

Journal: Journal of Chemical Information and Modeling

Published Date: Nov 8, 2020

Abstract

This work considers strategies to develop accurate and reliable graph neural networks (GNNs) for molecular property prediction. The prediction performance of GNNs is highly sensitive to changes in various parameters because of the inherent challenges in molecular machine learning, such as the scarcity of data samples and bias in the data distribution. Comparative studies with well-designed experiments are thus important to clearly understand which GNNs are powerful for molecular supervised learning. Our work presents a number of ablation studies along with a guideline to train and utilize GNNs for both molecular regression and classification tasks. First, we validate that using both atomic and bond meta-information improves the prediction performance in the regression task. Second, we find that the graph isomorphism hypothesis proposed by [Xu, K. How powerful are graph neural networks? 2018, arXiv:1810.00826. arXiv.org e-Print archive. https://arxiv.org/abs/1810.00826] is valid for the regression task. Surprisingly, however, the findings above do not hold for the classification tasks. Beyond the study on model architectures, we test various regularization methods and Bayesian learning algorithms to find the best strategy to achieve a reliable classification system. We demonstrate that regularization methods penalizing predictive entropy might not give well-calibrated probability estimates, even though they work well in other domains, whereas Bayesian learning methods are capable of producing reliable prediction systems. Furthermore, we argue for the importance of Bayesian learning in virtual screening by showing that well-calibrated probability estimates may lead to a higher success rate.
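The abstract does not say how calibration was measured. As an illustration only, one common way to quantify how well-calibrated a binary classifier is would be a binned calibration error, which compares the mean predicted probability in each bin with the observed fraction of positives. The sketch below is an assumption about a typical metric, not the authors' evaluation code; the function name, the equal-width binning, and the default of 10 bins are all hypothetical choices.

```python
def binned_calibration_error(probs, labels, n_bins=10):
    """Binned calibration error for a binary classifier.

    probs  : predicted probabilities of the positive class, each in [0, 1]
    labels : true labels, each 0 or 1
    n_bins : number of equal-width probability bins (illustrative default)

    Returns the sample-weighted mean of
    |observed positive fraction - mean predicted probability| over bins.
    """
    if len(probs) != len(labels):
        raise ValueError("probs and labels must have the same length")
    if len(probs) == 0:
        raise ValueError("at least one sample is required")

    # Assign each sample to an equal-width bin; p == 1.0 goes in the last bin.
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))

    total = len(probs)
    error = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        mean_prob = sum(p for p, _ in members) / len(members)
        frac_pos = sum(y for _, y in members) / len(members)
        error += (len(members) / total) * abs(frac_pos - mean_prob)
    return error


# Hypothetical data: the same 0.9 prediction for ten molecules.
calibrated = binned_calibration_error([0.9] * 10, [1] * 9 + [0])
overconfident = binned_calibration_error([0.9] * 10, [1] * 5 + [0] * 5)
```

In this toy case the first model is right about 90% of the time when it predicts 0.9, so its error is near zero, while the second is right only half the time and scores much worse. In a virtual screening setting, a lower value would suggest that a predicted probability can be read more directly as an expected hit rate.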

Authors

Doyeong Hwang
AITRICS, Hyoryoung-ro 77-gil, Seocho-gu, Seoul, Republic of Korea.

Soojung Yang
AITRICS, Hyoryoung-ro 77-gil, Seocho-gu, Seoul, Republic of Korea.

Yongchan Kwon
Department of Biomedical Data Science, Stanford University, Stanford, California 94305, United States.

Kyung Hoon Lee
Department of Chemistry, KAIST, 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea.

Grace Lee
AITRICS, Hyoryoung-ro 77-gil, Seocho-gu, Seoul, Republic of Korea.

Hanseok Jo
AITRICS, Hyoryoung-ro 77-gil, Seocho-gu, Seoul, Republic of Korea.

Seyeol Yoon
AITRICS, Hyoryoung-ro 77-gil, Seocho-gu, Seoul, Republic of Korea.

Seongok Ryu
Department of Chemistry, KAIST, Daejeon 34141, South Korea.

Keywords

Algorithms; Bayes Theorem; Machine Learning; Neural Networks, Computer; Supervised Machine Learning

External Resources

PubMed ID: 33164522
