Paper information - Energy-efficient VFI-partitioned multicore design using wireless NoC architectures

In recent years, designs based on multiple Voltage Frequency Islands (VFIs) have increasingly made their way into both commercial and research multicore platforms. At the same time, the wireless Network-on-Chip (WiNoC) architecture has emerged as an energy-efficient, high-bandwidth communication backbone for massively integrated multicore platforms. It is therefore possible to exploit the small-world effects induced by the wireless links of a WiNoC to achieve efficient inter-VFI data exchange. In this work, we demonstrate that WiNoCs can provide better latency and energy profiles than traditional mesh-like architectures for VFI-partitioned multicore designs. The performance gains and energy savings come from the low-power wireless shortcuts used in conjunction with the small-world architecture. Our experimental results show energy improvements as large as 40% for multithreaded application benchmarks.
