AnyCore: A synthesizable RTL model for exploring and fabricating adaptive superscalar cores

Adaptive superscalar cores can dynamically adjust their execution resources to match the instruction-level parallelism (ILP) of different program phases. The goal of adaptivity is to maximize performance as energy-efficiently as possible, which is achieved by disabling execution resources that contribute only marginally to performance for the code at hand. Researchers have proposed many adaptive features, including structure sizes, superscalar width, and pipeline depth. The benefits of adaptivity, however, are eroded by its circuit-level overheads, and these overheads cannot be effectively estimated or appreciated without a hardware design. To this end, we developed a register-transfer-level (RTL) design of a highly adaptive superscalar core, called AnyCore. AnyCore can be used to quantify the logic overheads of an adaptive core relative to fixed cores, to synthesize and compare different adaptive cores, to synthesize and compare an adaptive core against a multi-core composed of multiple fixed core types, and to fabricate adaptive superscalar cores. We provide examples of these use cases.
