H3 (Heterogeneity in 3D): A Logic-on-Logic 3D-Stacked Heterogeneous Multi-Core Processor

A single-ISA heterogeneous multi-core processor (HMP) [2], [7] comprises multiple core types that all implement the same instruction-set architecture (ISA) but have different microarchitectures. Performance and energy are optimized by migrating a thread's execution among core types as its characteristics change. Simulation-based studies with two core types, one simple (low power) and the other complex (high performance), have shown that being able to switch cores as frequently as once every 1,000 instructions increases energy savings by 50% compared to switching cores once every 10,000 instructions, for the same target performance [10]. These promising results rely on extremely low thread-migration latencies. Here we present the H3 chip, which uses 3D die stacking and a novel microarchitecture to implement an HMP with low-latency thread migration. We discuss details of the H3 design and present power and performance results from running various benchmarks on the chip. The H3 prototype can reduce the power consumption of benchmarks by up to 26%.

[1] Eric Rotenberg, et al. Rationale for a 3D heterogeneous multi-core processor, 2013, 2013 IEEE 31st International Conference on Computer Design (ICCD).

[2] Norman P. Jouppi, et al. Single-ISA heterogeneous multi-core architectures: the potential for processor power reduction, 2003, Proceedings. 36th Annual IEEE/ACM International Symposium on Microarchitecture, 2003. MICRO-36.

[3] Eric Rotenberg, et al. Under 100-cycle thread migration latency in a single-ISA heterogeneous multi-core processor, 2015, 2015 IEEE Hot Chips 27 Symposium (HCS).

[4] Srinivas Devadas, et al. Hardware-level thread migration in a 110-core shared-memory multiprocessor, 2013, 2013 IEEE Hot Chips 25 Symposium (HCS).

[5] Doug Burger, et al. Evaluating Future Microprocessors: the SimpleScalar Tool Set, 1996.

[6] Scott A. Mahlke, et al. Composite Cores: Pushing Heterogeneity Into a Core, 2012, 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture.

[7] Eric Rotenberg, et al. FabScalar: Composing synthesizable RTL designs of arbitrary cores within a canonical superscalar template, 2011, 2011 38th Annual International Symposium on Computer Architecture (ISCA).

[8] Vinesh Srinivasan, et al. Phase II Implementation and Verification of the H3 Processor, 2015.

[9] Eric Rotenberg, et al. Physical design of a 3D-stacked heterogeneous multi-core processor, 2016, 2016 IEEE International 3D Systems Integration Conference (3DIC).

[10] Eric Rotenberg, et al. Fast register consolidation and migration for heterogeneous multi-core processors, 2016, 2016 IEEE 34th International Conference on Computer Design (ICCD).