The message passing neural networks for chemical property prediction on SMILES.
Journal:
Methods (San Diego, Calif.)
Published Date:
May 21, 2020
Abstract
Drug metabolism is determined by the biochemical and physiological properties of the drug molecule. To improve the performance of a drug property prediction model, it is important to extract complex molecular dynamics from limited data. Recent machine learning and deep learning models have used atom- and bond-type information, as well as structural information, to predict drug properties. However, many of these methods can be used only with graph representations. The message passing neural network (MPNN) (Gilmer et al., 2017) is a framework for learning both local and global features from irregularly structured data, and it is invariant to permutations. The network performs an iterative message passing (MP) operation on each object and its neighbors, and obtains the final output from all messages regardless of their order. In this study, we applied the MP-based attention network (Nikolentzos et al., 2019), originally developed for text learning, to chemical classification tasks. Before training, we tokenized the characters and obtained embeddings of each molecular sequence. We conducted various experiments to maximize the predictivity of the model, and we trained and evaluated it on various chemical classification benchmark tasks. Our results are comparable to, or better than, those of previous state-of-the-art and baseline models. To the best of our knowledge, this is the first attempt to learn chemical strings using an MP-based algorithm. In the future, we will extend our work to more complex tasks such as regression and generation.
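The pipeline described in the abstract (character tokenization of a SMILES string, iterative message passing between each token and its neighbors, and an order-independent readout) can be illustrated with a minimal sketch. This is not the authors' model: the function names, the sliding-window neighborhood, the one-hot vectors standing in for learned embeddings, and the plain sum used for aggregation and readout are all assumptions chosen for illustration, and the attention mechanism is omitted.

```python
# Minimal sketch, NOT the published model. All names here are hypothetical.

def tokenize(smiles):
    """Character-level tokenization: one token per character.
    Note: this splits two-letter atoms such as 'Cl' into 'C' and 'l'."""
    return list(smiles)

def build_vocab(corpus):
    """Map every character seen in the corpus to an integer index."""
    chars = sorted({ch for s in corpus for ch in s})
    return {ch: i for i, ch in enumerate(chars)}

def one_hot(index, size):
    """One-hot vector; a stand-in for a learned embedding (assumption)."""
    return [1.0 if i == index else 0.0 for i in range(size)]

def window_edges(n_tokens, window=2):
    """Undirected edges between token positions at most `window` apart.
    The sliding-window neighborhood is an assumption of this sketch."""
    return [(i, j)
            for i in range(n_tokens)
            for j in range(i + 1, min(i + window + 1, n_tokens))]

def mp_step(feats, edges):
    """One message passing step: each node keeps its own features and
    adds the sum of its neighbors' features (order-independent)."""
    new = [list(f) for f in feats]
    for i, j in edges:
        for k in range(len(feats[i])):
            new[i][k] += feats[j][k]
            new[j][k] += feats[i][k]
    return new

def readout(feats):
    """Sum over all nodes: invariant to how the nodes are ordered."""
    return [sum(col) for col in zip(*feats)]

def encode(smiles, vocab, steps=2, window=2):
    """Tokenize, embed, run `steps` rounds of message passing, read out."""
    tokens = tokenize(smiles)
    feats = [one_hot(vocab[t], len(vocab)) for t in tokens]
    edges = window_edges(len(tokens), window)
    for _ in range(steps):
        feats = mp_step(feats, edges)
    return readout(feats)

def relabel(feats, edges, perm):
    """Renumber the nodes with permutation `perm` (old index -> new index)."""
    new_feats = [None] * len(feats)
    for old, new in enumerate(perm):
        new_feats[new] = feats[old]
    return new_feats, [(perm[i], perm[j]) for i, j in edges]

# Usage: encode ethanol and check that renumbering the nodes
# leaves the graph-level vector unchanged.
vocab = build_vocab(["CCO"])
vector = encode("CCO", vocab, steps=1, window=1)
feats = [one_hot(vocab[t], len(vocab)) for t in tokenize("CCO")]
p_feats, p_edges = relabel(feats, window_edges(3, 1), [2, 0, 1])
print(vector, readout(mp_step(p_feats, p_edges)))
```

Because both the neighbor aggregation and the readout are sums, renumbering the nodes does not change the final vector, which is the permutation invariance the abstract refers to. A trained model would replace the one-hot vectors and sums with learned embeddings and learned update functions.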