Special session paper: 3D nanosystems enable embedded abundant-data computing

The world’s appetite for abundant-data computing, where massive amounts of structured and unstructured data are analyzed, has increased dramatically. The computational demands of these applications, such as deep learning, far exceed the capabilities of today’s systems, especially energy-constrained embedded systems (e.g., mobile systems with limited battery capacity). These demands are unlikely to be met by isolated improvements in transistor technologies, memory technologies, or integrated circuit (IC) architectures alone. Transformative nanosystems, which leverage the unique properties of emerging nanotechnologies to create new IC architectures, are required to deliver unprecedented functionality, performance, and energy efficiency. We show that the projected energy efficiency benefits of domain-specific 3D nanosystems are in the range of 1,000x over today’s domain-specific 2D systems with off-chip DRAM, where energy efficiency is quantified using the product of system-level energy consumption and execution time. Such a drastic improvement is key to enabling new capabilities such as deep learning in embedded systems.
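The metric named in the abstract, the product of system-level energy consumption and execution time, can be illustrated with a minimal Python sketch. The function names and the numeric values below are hypothetical, chosen only to show how such a benefit ratio would be computed; they are not measurements or projections from the paper.

```python
def energy_delay_product(energy_joules: float, time_seconds: float) -> float:
    """Product of system-level energy consumption and execution time."""
    return energy_joules * time_seconds


def edp_benefit(baseline_edp: float, candidate_edp: float) -> float:
    """Ratio of baseline to candidate product; values above 1 favor the candidate."""
    return baseline_edp / candidate_edp


# Hypothetical numbers for illustration only (not taken from the paper).
baseline_edp = energy_delay_product(energy_joules=8.0, time_seconds=2.0)
candidate_edp = energy_delay_product(energy_joules=2.0, time_seconds=0.5)
benefit = edp_benefit(baseline_edp, candidate_edp)
print(f"benefit: {benefit:.1f}x")
```

Because the metric multiplies energy by time, a system that both consumes less energy and finishes sooner is rewarded on both counts: in this made-up case a 4x energy reduction combined with a 4x speedup gives a 16x benefit.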
