Enhancing Butterfly Fat Tree NoCs for FPGAs with Lightweight Flow Control

FPGA overlay networks-on-chip (NoCs) based on the Butterfly Fat Tree (BFT) topology and lightweight flow control can outperform state-of-the-art FPGA NoCs such as Hoplite on throughput, latency, cost, and power efficiency, while also providing features such as in-order delivery and bounded packet delivery times. On the one hand, lightweight FPGA NoCs built on bufferless deflection routing, such as Hoplite, deliver low-LUT-cost implementations but sacrifice crucial features such as in-order delivery, livelock freedom, and bounds on delivery times. On the other hand, capable conventional NoCs such as CONNECT provide these features but are significantly more expensive in LUT cost. Butterfly Fat Trees with lightweight flow control can deliver these features at medium cost while giving the developer flexibility in configuring bandwidth. We design FPGA-friendly routers that combine (1) latency-insensitive interfaces, (2) a deterministic routing policy, and (3) round-robin scheduling at NoC ports, yielding switches that take 311-375 LUTs per router. We evaluate our NoC under various conditions, including synthetic and real-world workloads, and show resource-proportional throughput and latency wins over competing NoCs, along with significantly lower dynamic power consumption than deflection-routed NoCs. We also explore the bandwidth customizability of the BFT organization to identify the best NoC configurations for resource-constrained and application-requirement-constrained scenarios.
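
The round-robin scheduling named in item (3) can be illustrated with a minimal behavioral sketch in Python. This is not the authors' RTL; the class name `RoundRobinArbiter`, its `grant` method, and the three-input example are assumptions made purely for illustration. It shows only the general technique: one requesting input is granted per cycle, and priority rotates to the input after the one just served.

```python
from typing import List, Optional


class RoundRobinArbiter:
    """Behavioral model of a round-robin arbiter (hypothetical, not the paper's design).

    At most one requesting input is granted per call, and the search for the
    next grant starts just after the most recently granted input.
    """

    def __init__(self, num_inputs: int) -> None:
        if num_inputs <= 0:
            raise ValueError("num_inputs must be positive")
        self.num_inputs = num_inputs
        self.next_priority = 0  # input examined first on the next call

    def grant(self, requests: List[bool]) -> Optional[int]:
        """Return the index of the granted input, or None if nothing requests."""
        if len(requests) != self.num_inputs:
            raise ValueError("one request flag per input is required")
        for offset in range(self.num_inputs):
            idx = (self.next_priority + offset) % self.num_inputs
            if requests[idx]:
                # Rotate priority so the granted input is examined last next time.
                self.next_priority = (idx + 1) % self.num_inputs
                return idx
        return None  # idle cycle: priority pointer is left unchanged


if __name__ == "__main__":
    arbiter = RoundRobinArbiter(3)
    # Inputs 0 and 1 contend continuously; input 2 stays idle.
    for cycle in range(4):
        print(cycle, arbiter.grant([True, True, False]))
```

Because priority rotates after every grant, a continuously requesting input waits for at most `num_inputs - 1` other grants before being served. That kind of bounded wait at each port is one ingredient a design can build on when aiming for bounded packet delivery times, although the abstract does not spell out how the routers achieve this.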
