Back Suction: Service Guarantees for Latency-Sensitive On-chip Networks

Networks-on-chip for future many-core processor platforms face an increasing diversity of traffic requirements, ranging from streaming traffic with real-time requirements to bursty, latency-sensitive best-effort traffic from general-purpose processors with caches. In this paper, we propose Back Suction, a novel flow-control scheme for implementing quality-of-service. Traffic with service guarantees is prioritized selectively, only when buffer occupancy at downstream routers is low. As a result, best-effort traffic is preferred, which improves its latency, as long as guaranteed-service traffic makes sufficient progress. We present a formal analysis and an experimental evaluation of the Back Suction scheme, showing that the latency of best-effort traffic improves over current approaches even under formal service guarantees for streaming traffic.
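
The prioritization rule described above can be illustrated with a minimal sketch. It is not the paper's implementation: the names `PortState` and `select_traffic_class`, the fixed occupancy `threshold`, and the per-port granularity are all assumptions introduced here for illustration. The sketch only captures the stated idea that guaranteed-service (GS) traffic is prioritized when downstream buffer occupancy is low, and that best-effort (BE) traffic is preferred otherwise.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PortState:
    """Hypothetical snapshot of one router output port (illustrative only)."""
    be_waiting: bool               # a best-effort flit is ready to send
    gs_waiting: bool               # a guaranteed-service flit is ready to send
    downstream_gs_occupancy: int   # GS flits currently buffered downstream
    threshold: int                 # assumed occupancy level that counts as "low"


def select_traffic_class(state: PortState) -> Optional[str]:
    """Pick the traffic class to forward in this cycle.

    GS traffic wins only while the downstream buffer is running low;
    otherwise BE traffic is preferred to keep its latency small.
    """
    downstream_low = state.downstream_gs_occupancy < state.threshold
    if state.gs_waiting and downstream_low:
        return "GS"   # downstream is draining: prioritize guaranteed traffic
    if state.be_waiting:
        return "BE"   # GS is making sufficient progress: prefer best effort
    if state.gs_waiting:
        return "GS"   # no BE flit pending, so the link would otherwise idle
    return None       # nothing to send


# Usage: the same pending flits lead to different choices
# depending on how full the downstream buffer is.
low = PortState(be_waiting=True, gs_waiting=True,
                downstream_gs_occupancy=1, threshold=2)
high = PortState(be_waiting=True, gs_waiting=True,
                 downstream_gs_occupancy=3, threshold=2)
print(select_traffic_class(low), select_traffic_class(high))
```

In this sketch the threshold is a plain constant; how the occupancy signal is derived and propagated upstream is left open here, since the abstract does not specify it.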
