Design and Applications for Embedded Networks-on-Chip on FPGAs

Field-programmable gate arrays (FPGAs) have evolved to include embedded memory, high-speed I/O interfaces and processors, making them both more efficient and easier to use for compute acceleration and networking applications. However, implementing on-chip communication is still a designer’s burden wherein custom system-level buses are implemented using the fine-grained FPGA logic and interconnect fabric. Instead, we propose augmenting FPGAs with an embedded network-on-chip (NoC) to implement system-level communication. We design custom interfaces to connect a packet-switched NoC to the FPGA fabric and I/Os in a configurable and efficient way and then define the necessary conditions to implement common FPGA design styles with an embedded NoC. Four application case studies highlight the advantages of using an embedded NoC. We show that access latency to external memory can be ∼1.5× lower. Our application case study with image compression shows that an embedded NoC improves frequency by 10-80%, reduces utilization of scarce long wires by 40% and makes design easier and more predictable. Additionally, we leverage the embedded NoC in creating a programmable Ethernet switch that can support up to 819 Gb/s, with 5× more switching bandwidth and 3× lower area compared to previous work. Finally, we design a 400 Gb/s NoC-based packet processor that is very flexible and more efficient than other FPGA-based packet processors.
