Technique of Augmenting Molecular Graph Data by Perturbating Hidden Features.

Journal: Molecular informatics
Published Date:

Abstract

Quantitative structure-property relationship models are useful in efficiently searching for molecules with desired properties in drug discovery and materials development. In recent years, many such models based on graph neural networks, showing good prediction performance, have been reported. Training graph neural networks generally requires many samples, but by using a training method for a small dataset, it is possible to extract features that enable successful prediction. Herein, we design a method of augmenting graph data. In this method, random perturbations are added with a certain probability to some vertex features during message passing. We verify the proposed method's effectiveness in regression and classification tasks. It is confirmed that the proposed method is effective when the perturbation is added immediately before the readout of the graph neural network, and the effect of the data augmentation is most evident for small datasets of approximately 1000 samples.
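
The idea of perturbing vertex features immediately before the readout can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the noise distribution, the network architecture, or the readout, so the Gaussian noise, the per-vertex Bernoulli mask, the sum-aggregation layer, the sum readout, and all function and parameter names (`message_passing`, `perturb_vertices`, `forward`, `prob`, `scale`) are assumptions chosen for illustration only.

```python
import numpy as np

def message_passing(h, adj, weight):
    """One simplified step: add summed neighbour features, then linear map + ReLU.
    (Assumed layer form; the abstract does not specify the architecture.)"""
    messages = adj @ h  # sum of neighbour hidden features for each vertex
    return np.maximum((h + messages) @ weight, 0.0)

def perturb_vertices(h, prob, scale, rng):
    """Add noise to each vertex's hidden features with probability `prob`.
    Gaussian noise with standard deviation `scale` is an assumption."""
    mask = rng.random(h.shape[0]) < prob          # which vertices are perturbed
    noise = rng.normal(0.0, scale, size=h.shape)  # candidate noise for every vertex
    return h + mask[:, None] * noise

def forward(x, adj, weights, prob=0.0, scale=0.1, rng=None, training=False):
    """Run message passing, optionally perturb just before the readout,
    then pool vertex features into one graph-level vector (sum readout, assumed)."""
    h = x
    for w in weights:
        h = message_passing(h, adj, w)
    if training and prob > 0.0:
        h = perturb_vertices(h, prob, scale, rng)  # augmentation at training time only
    return h.sum(axis=0)

# Toy graph: a 3-vertex chain with 4 features per vertex (made-up data).
rng = np.random.default_rng(0)
adj = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
x = rng.normal(size=(3, 4))
weights = [rng.normal(size=(4, 4)) for _ in range(2)]

clean = forward(x, adj, weights)  # deterministic graph embedding
augmented = forward(x, adj, weights, prob=0.5, scale=0.1, rng=rng, training=True)
```

In this sketch, each training pass over the same molecule can yield a slightly different graph-level vector, which acts as augmentation without altering the molecular graph itself. At inference time the perturbation is switched off, so predictions stay deterministic.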

Authors

  • Takahiro Inoue
    Department of Gastrointestinal Oncology, Osaka International Cancer Institute, 3-1-69 Otemae, Chuo-ku, Osaka, 541-8567, Japan.
  • Kenichi Tanaka
    Division of Gastrointestinal Surgery, Department of Surgery, Graduate School of Medicine, Kobe University, Kobe, Japan. tanakake@med.kobe-u.ac.jp.
  • Kimito Funatsu
    The University of Tokyo, School of Engineering, Department of Chemical System Engineering, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8656.