AI Accelerator Survey and Trends
Jeremy Kepner | Vijay Gadepally | Siddharth Samsi | Albert Reuther | Peter Michaleas | Michael Jones
[1] Mingguo Zhao, et al. Towards artificial general intelligence with hybrid Tianjic chip architecture, 2019, Nature.
[2] Neil C. Thompson, et al. The decline of computers as a general purpose technology, 2021, Commun. ACM.
[3] David Patterson, et al. A domain-specific supercomputer for training deep neural networks, 2020, Commun. ACM.
[4] Andreas Olofsson. Epiphany-V: A 1024 processor 64-bit RISC System-On-Chip, 2016, arXiv.
[5] Jeremy Kepner, et al. Survey of Machine Learning Accelerators, 2020, 2020 IEEE High Performance Extreme Computing Conference (HPEC).
[6] Saif Khan, et al. AI Chips: What They Are and Why They Matter, 2020.
[7] Yiran Chen, et al. A Survey of Accelerator Architectures for Deep Neural Networks, 2020.
[8] David A. Patterson, et al. A domain-specific architecture for deep neural networks, 2018, Commun. ACM.
[9] Clark S. Lindsey, et al. Survey of neural network hardware, 1995, SPIE Defense + Commercial Sensing.
[10] Joel Emer, et al. Eyeriss: A spatial architecture for energy-efficient dataflow for convolutional neural networks, 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[11] Andrew S. Cassidy, et al. Convolutional networks for fast, energy-efficient neuromorphic computing, 2016, Proceedings of the National Academy of Sciences.
[12] Peter C. Ma, et al. Ten Lessons From Three Generations Shaped Google's TPUv4i: Industrial Product, 2021, 2021 ACM/IEEE 48th Annual International Symposium on Computer Architecture (ISCA).
[13] Debjit Das Sarma, et al. Compute Solution for Tesla's Full Self-Driving Computer, 2020, IEEE Micro.
[14] Butler W. Lampson, et al. There's plenty of room at the Top: What will drive computer performance after Moore's law?, 2020, Science.
[15] Mohamed S. Abdelfattah, et al. DLA: Compiler and FPGA Overlay for Neural Network Inference Acceleration, 2018, 2018 28th International Conference on Field Programmable Logic and Applications (FPL).
[16] Indranil Saha, et al. Artificial Neural Networks in Hardware: A Survey, 2008.
[17] Christoforos E. Kozyrakis, et al. TETRIS: Scalable and Efficient Neural Network Acceleration with 3D Memory, 2017, ASPLOS.
[18] Joel Emer, et al. Efficient Processing of Deep Neural Networks, 2020, Synthesis Lectures on Computer Architecture.
[19] John Thompson, et al. Think Fast: A Tensor Streaming Processor (TSP) for Accelerating Deep Learning Workloads, 2020, 2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture (ISCA).
[20] Dhireesha Kudithipudi, et al. Digital neuromorphic chips for deep learning inference: a comprehensive study, 2019, SPIE Optical Engineering + Applications.
[21] Indranil Saha, et al. Artificial Neural Networks in Hardware: A Survey of Two Decades of Progress, 2010, Neurocomputing.
[22] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.
[23] William J. Dally, et al. Domain-specific hardware accelerators, 2020, Commun. ACM.
[24] Yin Ma, et al. Kunlun: A 14nm High-Performance AI Processor for Diversified Workloads, 2021, 2021 IEEE International Solid-State Circuits Conference (ISSCC).
[25] Cody Coleman, et al. MLPerf Inference Benchmark, 2019, 2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture (ISCA).
[26] Groq Rocks Neural Networks, 2020.
[27] Jeremy Kepner, et al. Survey and Benchmarking of Machine Learning Accelerators, 2019, 2019 IEEE High Performance Extreme Computing Conference (HPEC).
[28] Vivienne Sze, et al. Efficient Processing of Deep Neural Networks: A Tutorial and Survey, 2017, Proceedings of the IEEE.
[29] Vivienne Sze, et al. Eyeriss: An Energy-Efficient Reconfigurable Accelerator for Deep Convolutional Neural Networks, 2017, IEEE Journal of Solid-State Circuits.
[30] Eitan Medina, et al. Habana Labs Purpose-Built AI Inference and Training Processor Architectures: Scaling AI Training Systems Using Standard Ethernet With Gaudi Processor, 2020, IEEE Micro.
[31] David A. Patterson, et al. A new golden age for computer architecture, 2019, Commun. ACM.
[32] Zain-ul-Abdin, et al. Kickstarting high-performance energy-efficient manycore architectures with Epiphany, 2014, 2014 48th Asilomar Conference on Signals, Systems and Computers.
[33] Glenn Henry, et al. High-Performance Deep-Learning Coprocessor Integrated into x86 SoC with Server-Class CPUs: Industrial Product, 2020, 2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture (ISCA).
[34] Wayne Luk, et al. Deep Neural Network Approximation for Custom Hardware, 2019, ACM Comput. Surv.
[35] Samuel Williams, et al. Roofline: an insightful visual performance model for multicore architectures, 2009, Commun. ACM.
[36] Berin Martini, et al. NeuFlow: A runtime reconfigurable dataflow processor for vision, 2011, CVPR 2011 Workshops.
[37] Bernard Brezzo, et al. TrueNorth: Design and Tool Flow of a 65 mW 1 Million Neuron Programmable Neurosynaptic Chip, 2015, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[38] Thomas N. Theis, et al. The End of Moore's Law: A New Beginning for Information Technology, 2017, Computing in Science & Engineering.
[39] Eugenio Culurciello, et al. An Analysis of Deep Neural Network Models for Practical Applications, 2016, arXiv.
[40] Ulrich Rueckert, et al. Digital Neural Network Accelerators, 2020.
[41] Machine Learning Moves to the Edge, 2020.
[42] Song Han, et al. EIE: Efficient Inference Engine on Compressed Deep Neural Network, 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[43] Jeremy Kepner, et al. AI Enabling Technologies, 2019.