Graphormer-IR: Graph Transformers Predict Experimental IR Spectra Using Highly Specialized Attention.

Journal: Journal of Chemical Information and Modeling

Published Date: Jun 6, 2024

Abstract

Infrared (IR) spectroscopy is an important analytical tool in various chemical and forensic domains, and a great deal of effort has gone into developing methods for predicting experimental spectra. A key challenge in this regard is generating highly accurate spectra quickly to enable real-time feedback between computation and experiment. Here, we employ Graphormer, a graph neural network (GNN) transformer, to predict IR spectra using only simplified molecular-input line-entry system (SMILES) strings. Our data set includes 53,528 high-quality spectra, measured in five different experimental media (i.e., phases), for molecules containing the elements H, C, N, O, F, Si, S, P, Cl, Br, and I. When using only atomic numbers for node encodings, Graphormer-IR achieved a mean test spectral information similarity (SISμ) value of 0.8449 ± 0.0012 (n = 5), which surpasses that of the current state-of-the-art model Chemprop-IR (SISμ = 0.8409 ± 0.0014, n = 5) with only 36% of the encoded information. Augmenting node embeddings with additional node-level descriptors in learned embeddings generated through a multilayer perceptron improves scores to SISμ = 0.8523 ± 0.0006, a total improvement of 19.7σ (t = 19). These improved scores show how Graphormer-IR excels in capturing long-range interactions like hydrogen bonding, anharmonic peak positions in experimental spectra, and stretching frequencies of uncommon functional groups. Scaling our architecture to 210 attention heads demonstrates specialist-like behavior for distinct IR frequencies that improves model performance. Our model utilizes novel architectures, including a global node for phase encoding, learned node feature embeddings, and a one-dimensional (1D) smoothing convolutional neural network (CNN). Graphormer-IR's innovations underscore its value over traditional message-passing neural networks (MPNNs) due to its expressive embeddings and ability to capture long-range intramolecular relationships.
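
The abstract reports its scores as spectral information similarity (SIS) but does not define the metric. As a rough illustration only, the sketch below assumes the commonly used form SIS = 1 / (1 + SID), where SID is the symmetric Kullback-Leibler divergence between two spectra normalized to unit sum. That definition, the function name, the `eps` floor, and the toy Gaussian spectra are all assumptions made for this example. They are not taken from the abstract, and any smoothing step the authors may apply before scoring is omitted.

```python
import numpy as np


def spectral_information_similarity(pred, target, eps=1e-12):
    """Hypothetical SIS sketch: 1 / (1 + SID).

    ASSUMPTION: SID is the symmetric KL divergence between the two
    spectra after each is normalized to sum to 1. This is not stated
    in the abstract.
    """
    # Floor intensities at eps so the logarithms stay finite.
    p = np.clip(np.asarray(pred, dtype=float), eps, None)
    q = np.clip(np.asarray(target, dtype=float), eps, None)
    # Normalize each spectrum to a discrete probability distribution.
    p = p / p.sum()
    q = q / q.sum()
    # Symmetric KL divergence (spectral information divergence).
    sid = np.sum(p * np.log(p / q) + q * np.log(q / p))
    return 1.0 / (1.0 + sid)


# Toy single-peak spectra on a made-up wavenumber grid (not paper data).
grid = np.linspace(400.0, 4000.0, 1801)
target = np.exp(-0.5 * ((grid - 1700.0) / 20.0) ** 2)
shifted = np.exp(-0.5 * ((grid - 1720.0) / 20.0) ** 2)

same_score = spectral_information_similarity(target, target)
shifted_score = spectral_information_similarity(shifted, target)
```

Under this assumed definition, identical spectra score exactly 1, and the score falls toward 0 as peaks move apart or change shape, which is why small differences in the third decimal place of a mean SIS can still be meaningful across a large test set.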

Authors

Cailum M K Stienstra
Department of Chemistry, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada.

Liam Hebert
Cheriton School of Computer Science, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada.

Patrick Thomas
Department of Chemistry, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada.

Alexander Haack
Department of Chemistry, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada.

Jason Guo
Department of Chemistry, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada.

W Scott Hopkins
Department of Chemistry, University of Waterloo, 200 University Avenue West, Waterloo, Ontario N2L 3G1, Canada; Waterloo Institute for Nanotechnology, University of Waterloo, 200 University Avenue West, Waterloo, Ontario N2L 3G1, Canada; WaterMine Innovation, Inc., Waterloo, Ontario N0B 2T0, Canada; and Centre for Eye and Vision Research, Hong Kong Science Park, New Territories, 999077, Hong Kong. Email: shopkins@uwaterloo.ca

Keywords

Neural Networks, Computer; Spectrophotometry, Infrared

External Resources

PubMed ID: 38845400
