Energy Efficient Network-on-Chip Architectures for Many-Core Near-Threshold Computing System

Near-threshold computing (NTC) has opened up a promising design space for energy-efficient computing, but it is still plagued by sub-optimal system performance. Application characteristics and the hardware non-idealities of conventional architectures (those optimized for nominal voltage) prevent NTC systems from reaching their full potential. Many contemporary works address the performance degradation in NTC systems by increasing the computational core count. However, these works do not explicitly address the shortcomings of the conventional on-chip interconnect fabric in a many-core environment. In this work, we quantitatively demonstrate the performance bottleneck created by a conventional network-on-chip (NoC) architecture in many-core NTC systems. To reclaim the performance lost due to a sub-optimal NoC, we propose BoostNoC, a power-efficient, multi-layered NoC architecture. BoostNoC improves system performance by nearly 2× over a conventional NTC system while largely sustaining its energy benefits. Further, capitalizing on application characteristics, we propose two BoostNoC derivative designs, (i) PG BoostNoC and (ii) Drowsy BoostNoC, which improve energy efficiency by 1.4× and 1.37×, respectively, over a conventional NTC system.
