AMPLE: Event-Driven Accelerator for Mixed-Precision Inference of Graph Neural Networks
Journal: arXiv
Published Date: Feb 28, 2025
Abstract
Graph Neural Networks (GNNs) have recently gained attention due to their
performance on non-Euclidean data. The use of custom hardware architectures
proves particularly beneficial for GNNs due to their irregular memory access
patterns, resulting from the sparse structure of graphs. However, existing FPGA
accelerators are limited by their double buffering mechanism, which does not
account for the irregular node distribution in typical graph datasets. To
address this, we introduce \textbf{AMPLE} (Accelerated Message Passing Logic
Engine), an FPGA accelerator leveraging a new event-driven programming flow. We
develop a mixed-arithmetic architecture, enabling GNN inference to be quantized
at a node-level granularity. Finally, a prefetcher for data and instructions is
implemented to optimize off-chip memory access and maximize node parallelism.
Evaluation on citation and social media graph datasets ranging from $2$K to
$700$K nodes showed a mean speedup of $243\times$ and $7.2\times$ against CPU
and GPU counterparts, respectively.
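
To illustrate why irregular node distribution can penalize a batch-synchronous flow such as double buffering, the toy model below compares two ways of scheduling nodes across a fixed number of parallel lanes. It is a minimal sketch under stated assumptions, not the AMPLE microarchitecture: the per-node costs, the lane count, and the function names `lockstep_makespan` and `event_driven_makespan` are all hypothetical, and memory latency is ignored.

```python
import heapq


def lockstep_makespan(work, lanes):
    """Process nodes in fixed batches of `lanes`; each batch waits for its slowest node."""
    return sum(max(work[i:i + lanes]) for i in range(0, len(work), lanes))


def event_driven_makespan(work, lanes):
    """Each lane picks up the next node as soon as it becomes free."""
    free_at = [0] * lanes          # time at which each lane becomes idle
    heapq.heapify(free_at)
    for cost in work:
        start = heapq.heappop(free_at)      # earliest available lane
        heapq.heappush(free_at, start + cost)
    return max(free_at)


# Hypothetical per-node costs (e.g. proportional to neighbour count):
# a few high-degree hubs among many low-degree nodes.
work = [50, 1, 1, 1, 40, 2, 1, 1, 30, 1, 2, 1]
LANES = 4  # assumed number of parallel node slots

lockstep = lockstep_makespan(work, LANES)
event_driven = event_driven_makespan(work, LANES)
print(f"lockstep: {lockstep} time units, event-driven: {event_driven} time units")
```

In this toy workload every lockstep batch is held up by one high-degree node, whereas the event-driven schedule lets the remaining lanes keep consuming low-degree nodes while a hub is still being processed.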