AI Accelerator Survey and Trends

Over the past several years, new machine learning accelerators have been announced and released every month, targeting applications that range from speech recognition and video object detection to assisted driving and data center workloads. This paper updates the survey of AI accelerators and processors from the past two years. It collects and summarizes the current commercial accelerators that have been publicly announced, along with their peak performance and power consumption figures. These performance and power values are plotted on a scatter plot, and several dimensions and observations from the trends on this plot are again discussed and analyzed. This year, we also compile a list of benchmarking performance results and compute each accelerator's computational efficiency with respect to its peak performance.
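As an illustration of the two quantities the abstract mentions, the following Python sketch shows how a computational efficiency figure can be derived as the ratio of a measured benchmark result to the vendor-stated peak, and how accelerators can be placed on a log-log peak-performance-versus-power scatter plot. This is not the paper's actual code, and the device names and numbers are placeholders, not data from the survey.

```python
# Minimal sketch, assuming per-device tuples of
# (name, peak performance, power in watts, measured benchmark throughput),
# all in consistent units (e.g., TOPS or TFLOPS).
import matplotlib.pyplot as plt

accelerators = [
    # hypothetical entries for illustration only
    ("chip_a", 100.0, 75.0, 62.0),
    ("chip_b", 400.0, 300.0, 210.0),
    ("chip_c", 4.0, 2.5, 3.1),
]

# Computational efficiency: fraction of the stated peak actually achieved
# on a benchmark workload.
for name, peak, power, measured in accelerators:
    efficiency = measured / peak
    print(f"{name}: {efficiency:.0%} of peak")

# Scatter plot of peak performance versus power on log-log axes,
# the kind of plot the survey uses to compare accelerators.
powers = [a[2] for a in accelerators]
peaks = [a[1] for a in accelerators]
plt.scatter(powers, peaks)
plt.xscale("log")
plt.yscale("log")
plt.xlabel("Power (W)")
plt.ylabel("Peak performance")
plt.title("Peak performance vs. power")
plt.show()
```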
