Achieving Super-Linear Speedup across Multi-FPGA for Real-Time DNN Inference

Real-time Deep Neural Network (DNN) inference with low-latency requirements has become increasingly important for numerous applications in both cloud computing (e.g., Apple’s Siri) and edge computing...