Acceleration of Graph Neural Network-Based Prediction Models in Chemistry via Co-Design Optimization on Intelligence Processing Units.

Journal: Journal of chemical information and modeling
PMID:

Abstract

Atomic structure prediction and associated property calculations are the bedrock of chemical physics. High-fidelity ab initio modeling techniques for computing structure and properties can be prohibitively expensive, which motivates the development of machine-learning (ML) models that make these predictions more efficiently. Training graph neural networks (GNNs) over large atomistic databases introduces unique computational challenges, such as the need to process millions of small graphs of variable size and to support communication patterns distinct from those of learning over large graphs, such as social networks. We demonstrate a novel hardware-software codesign approach to scale up the training of atomistic GNNs for structure and property prediction. First, to eliminate the redundant computation and memory associated with alternative padding techniques and to improve throughput by minimizing communication, we formulate the effective coalescing of batches of variable-size atomistic graphs as a bin packing problem and introduce a hardware-agnostic algorithm to pack these batches. In addition, we propose hardware-specific optimizations, including a planner and vectorization for the gather-scatter operations targeted at Graphcore's Intelligence Processing Unit (IPU), as well as model-specific optimizations such as merged communication collectives and an optimized softplus. Putting these together, we demonstrate the effectiveness of the proposed codesign approach by implementing a well-established atomistic GNN on Graphcore IPUs. We evaluate training performance on multiple atomistic graph databases with varying graph counts, sizes, and sparsity. We demonstrate that such a codesign approach can reduce the training time of atomistic GNNs and can improve their performance by up to 1.5× compared to the baseline implementation of the model on the IPUs. Additionally, we compare our IPU implementation with an Nvidia GPU-based implementation and show that our atomistic GNN implementation on the IPUs can run 1.8× faster on average than the GPU implementation.
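
The batch-packing idea in the abstract can be illustrated with a minimal sketch. It assumes a first-fit-decreasing heuristic, a standard bin packing approximation that is not necessarily the algorithm the authors introduce, and it packs graphs by node count only. The names `pack_graphs`, `padded_slots`, and `max_nodes` are hypothetical and chosen for illustration.

```python
from typing import List, Sequence


def pack_graphs(node_counts: Sequence[int], max_nodes: int) -> List[List[int]]:
    """Group graph indices into packs holding at most `max_nodes` nodes each.

    Assumption: first-fit decreasing is used here as a stand-in heuristic;
    the abstract does not specify the packing algorithm.
    """
    # Visit graphs from largest to smallest node count.
    order = sorted(range(len(node_counts)),
                   key=lambda i: node_counts[i], reverse=True)
    packs: List[List[int]] = []  # graph indices assigned to each pack
    free: List[int] = []         # remaining node capacity of each pack
    for i in order:
        n = node_counts[i]
        if n > max_nodes:
            raise ValueError(f"graph {i} has {n} nodes, exceeds {max_nodes}")
        # Place the graph in the first pack with enough room.
        for p, room in enumerate(free):
            if n <= room:
                packs[p].append(i)
                free[p] -= n
                break
        else:
            # No existing pack fits, so open a new one.
            packs.append([i])
            free.append(max_nodes - n)
    return packs


def padded_slots(node_counts: Sequence[int], packs: List[List[int]],
                 max_nodes: int) -> int:
    """Count the unused (padding) node slots across all packs."""
    return len(packs) * max_nodes - sum(node_counts)


if __name__ == "__main__":
    counts = [9, 3, 12, 5, 7, 4, 8]   # hypothetical graph sizes
    packs = pack_graphs(counts, max_nodes=16)
    print(packs)                       # [[2, 5], [0, 4], [6, 3, 1]]
    print(padded_slots(counts, packs, 16))  # 0
```

In this toy case the seven graphs fill three packs of 16 node slots with no padding, whereas padding every graph to the largest size (12 nodes) would allocate 84 slots for 48 real nodes. A practical packer would likely also bound the number of edges and graphs per pack, which this sketch omits.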

Authors

  • Hatem Helal
    Graphcore, Kett House, Station Rd, Cambridge CB1 2JH, U.K.
  • Jesun Firoz
    Advanced Computing, Mathematics and Data Division, Pacific Northwest National Laboratory, 1100 Dexter Ave N, Seattle, Washington 98109, United States.
  • Jenna A Bilbrey
    Pacific Northwest National Laboratory, 902 Battelle Blvd, Richland, WA, 99352, USA.
  • Henry Sprueill
    Artificial Intelligence and Data Analytics Division, Pacific Northwest National Laboratory, 902 Battelle Boulevard, Richland, Washington 99352, United States.
  • Kristina M Herman
    Department of Chemistry, University of Washington, Seattle, Washington 98185, United States.
  • Mario Michael Krell
    Robotics Research Group, University of Bremen, Robert-Hooke-Str. 1, Bremen, Germany.
  • Tom Murray
    Graphcore, Kett House, Station Rd, Cambridge CB1 2JH, U.K.
  • Manuel Lopez Roldan
    Graphcore, Kett House, Station Rd, Cambridge CB1 2JH, U.K.
  • Mike Kraus
    Graphcore, Kett House, Station Rd, Cambridge CB1 2JH, U.K.
  • Ang Li
    Section of Hematology-Oncology, Department of Medicine, Baylor College of Medicine, Houston, Texas; Clinical Research Division, Fred Hutchinson Cancer Research Center, Seattle, Washington. Electronic address: ang.li2@bcm.edu.
  • Payel Das
    IBM Thomas J. Watson Research Center, Yorktown Heights, NY, USA. daspa@us.ibm.com.
  • Sotiris S Xantheas
    Department of Chemistry, University of Washington, Seattle, Washington 98185, United States.
  • Sutanay Choudhury
    Advanced Computing, Mathematics and Data Division, Pacific Northwest National Laboratory, 902 Battelle Boulevard, Richland, Washington 99352, United States.