System-level Evaluation of Chip-Scale Silicon Photonic Networks for Emerging Data-Intensive Applications

Emerging data-driven applications, such as graph processing, are characterized by large memory footprints and abundant parallelism, which together result in high memory bandwidth demand. As application datasets reach the order of terabytes, the performance limits imposed by bandwidth demand are a major concern. Traditional on-chip electrical networks fail to meet such high bandwidth demands because of increased energy-per-bit or physical limitations on pin counts. Silicon photonic networks have emerged as a promising alternative to electrical interconnects, owing to their high bandwidth density and low energy-per-bit communication with negligible data-dependent power. Wide-scale adoption of silicon photonics at the chip level, however, is hampered by its high sensitivity to process and thermal variations, the high laser power needed to compensate for losses along the network, and the power consumed by electrical-optical conversion. Device-level innovations that mitigate these issues are promising, yet they do not consider the system-level implications of the applications running on manycore systems with photonic networks. This work aims to bridge the gap between the system-level attributes of applications and the underlying architectural and device-level characteristics of silicon photonic networks to achieve energy-efficient computing. We focus in particular on graph applications, which involve unstructured yet abundant parallel memory accesses that stress on-chip communication networks, and we develop a cross-layer framework to evaluate 2.5D systems with silicon photonic networks. We demonstrate 38% power savings through system-level management using wavelength selection policies, with only a 1% loss in system performance, and we further evaluate architectural design choices for 2.5D systems with photonic networks.
