Virtualizing network-on-chip resources in chip-multiprocessors

The number of cores on a single silicon chip is growing rapidly, and chips containing tens or even hundreds of identical cores are expected in the future. To take advantage of multicore chips, multiple applications will run simultaneously. As a consequence, traffic interference between applications increases, and the performance of individual applications can be seriously affected. In this paper, we present a proposal to improve individual application performance when several applications run simultaneously. The proposal is based on the virtualization concept and reduces execution time and network latency by a significant percentage.
