Optimizing Graph Neural Networks for Jet Tagging in Particle Physics on FPGAs

This work proposes a novel reconfigurable architecture for reducing the latency of JEDI-net, a Graph Neural Network (GNN) based algorithm for jet tagging in particle physics that achieves state-of-the-art accuracy. Accelerating JEDI-net is challenging because deploying it for event selection at the CERN Large Hadron Collider demands very low latency. This paper proposes an outer-product based matrix multiplication approach customized for the GNN-based JEDI-net, which increases spatial data locality and reduces design latency. It is further enhanced by a code transformation with strength reduction that exploits the sparsity patterns and binary nature of the adjacency matrices, improving hardware efficiency while further reducing latency. In addition, a customizable template for this architecture has been designed and open-sourced; it enables high-level synthesis tools to generate low-latency FPGA designs with efficient resource utilization. Evaluation results show that our FPGA implementation is up to 9.5 times faster and consumes up to 6.5 times less power than a GPU implementation. Moreover, the throughput of our FPGA design is high enough to deploy JEDI-net in a sub-microsecond, real-time collider trigger system, allowing the trigger to benefit from the network's improved accuracy.
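The two key ideas, an outer-product formulation of the matrix multiplication and strength reduction over binary adjacency matrices, can be illustrated with a short sketch. The C++ code below is a minimal illustration under assumed toy dimensions, not the open-sourced HLS template; the operand names (F, Rr) and sizes (N_FEAT, N_NODES, N_EDGES) are hypothetical stand-ins for the interaction-network operands.

```cpp
// Minimal C++ sketch (not the released HLS template). Assumes an
// interaction-network style product E = F * Rr, where F is a dense
// (N_FEAT x N_NODES) feature matrix and Rr is a binary
// (N_NODES x N_EDGES) "receiving" adjacency matrix.
#include <array>
#include <cstddef>

constexpr std::size_t N_FEAT  = 4;                    // illustrative sizes only
constexpr std::size_t N_NODES = 8;
constexpr std::size_t N_EDGES = N_NODES * (N_NODES - 1);

using FeatMat = std::array<std::array<float, N_NODES>, N_FEAT>;
using AdjMat  = std::array<std::array<float, N_EDGES>, N_NODES>;
using EdgeMat = std::array<std::array<float, N_EDGES>, N_FEAT>;

// Baseline: inner-product formulation. Each output element walks a full row
// of F and column of Rr, so the F values just loaded are poorly reused.
EdgeMat matmul_inner(const FeatMat& F, const AdjMat& Rr) {
    EdgeMat E{};
    for (std::size_t i = 0; i < N_FEAT; ++i)
        for (std::size_t j = 0; j < N_EDGES; ++j)
            for (std::size_t k = 0; k < N_NODES; ++k)
                E[i][j] += F[i][k] * Rr[k][j];
    return E;
}

// Outer-product formulation: the k loop is outermost, so column k of F is
// reused across the whole rank-1 update F[:,k] * Rr[k,:], improving spatial
// locality. Because Rr is binary, each multiply strength-reduces to a
// conditional accumulation (in hardware, a multiplexer feeding an adder).
EdgeMat matmul_outer_binary(const FeatMat& F, const AdjMat& Rr) {
    EdgeMat E{};
    for (std::size_t k = 0; k < N_NODES; ++k)          // one rank-1 update per node
        for (std::size_t j = 0; j < N_EDGES; ++j)
            if (Rr[k][j] != 0.0f)                      // binary entry: add or skip
                for (std::size_t i = 0; i < N_FEAT; ++i)
                    E[i][j] += F[i][k];                // no multiplier needed
    return E;
}
```

In an FPGA design, such rank-1 updates map naturally onto arrays of adders fed by multiplexers, which is one plausible way the binary adjacency structure translates into lower latency and resource usage.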
