Safely Preventing Unbounded Delays During Bus Transactions in FPGA-based SoC

Advanced eXtensible Interface (AXI) is an open-standard communication bus interface implemented in most commercial off-the-shelf FPGA Systems-on-Chip (SoCs) to exchange data within the chip. Unfortunately, the AXI standard does not mandate any mechanism to detect possible misbehavior of the connected modules. This work shows that this lack of specification has a relevant impact on popular implementations of the AXI bus. In particular, it shows how easily a misbehaving bus master can inject arbitrarily long delays on modern FPGA SoCs. To safely solve this issue, this paper presents a general timing analysis to bound the execution of periodically invoked hardware accelerators in nominal conditions. This timing analysis is then used to configure a latency-free hardware module named AXI Stall Monitor (ASM), also proposed in this paper, capable of detecting and safely resolving possible stalls during AXI bus transactions. The ASM leaves a quantified flexibility to the hardware accelerators when they deviate from nominal conditions. The contribution is finally supported by a set of experiments on the Zynq-7000 and Zynq UltraScale+ SoCs by Xilinx.
