A domain-specific architecture for deep neural networks
Norman P. Jouppi | Cliff Young | Nishant Patil | David A. Patterson
[1] Krste Asanović. Programmable Neurocomputing .
[2] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2017 .
[3] G.E. Moore,et al. No exponential is forever: but "Forever" can be delayed! [semiconductor industry] , 2003, 2003 IEEE International Solid-State Circuits Conference. Digest of Technical Papers. ISSCC.
[5] Natalie D. Enright Jerger,et al. Cnvlutin: Ineffectual-Neuron-Free Deep Neural Network Computing , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[6] Song Han,et al. EIE: Efficient Inference Engine on Compressed Deep Neural Network , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[7] Klaus-Dieter Lange,et al. Identifying Shades of Green: The SPECpower Benchmarks , 2009, Computer.
[8] Gu-Yeon Wei,et al. Minerva: Enabling Low-Power, Highly-Accurate Deep Neural Network Accelerators , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[9] Martín Abadi,et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems , 2016, ArXiv.
[10] D. Hammerstrom,et al. A VLSI architecture for high-performance, low-cost, on-chip learning , 1990, 1990 IJCNN International Joint Conference on Neural Networks.
[11] James E. Smith,et al. Decoupled access/execute computer architectures , 1984, TOCS.
[12] Kurt Keutzer,et al. If I could only design one circuit ...: technical perspective , 2016, Communications of the ACM.
[13] David A. Patterson,et al. Latency Lags Bandwidth , 2005, ICCD.
[14] Luiz André Barroso,et al. The Case for Energy-Proportional Computing , 2007, Computer.
[15] Jian Sun,et al. Identity Mappings in Deep Residual Networks , 2016, ECCV.
[16] David A. Patterson,et al. Latency lags bandwidth , 2004, CACM.
[17] David A. Patterson,et al. The case for the reduced instruction set computer , 1980, ACM SIGARCH Computer Architecture News.
[18] Pradeep Dubey,et al. SCALEDEEP: A scalable compute architecture for learning and evaluating deep networks , 2017, 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).
[20] George Kurian,et al. Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation , 2016, ArXiv.
[21] William J. Dally,et al. SCNN: An accelerator for compressed-sparse convolutional neural networks , 2017, 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).
[22] Samuel Williams,et al. Roofline: an insightful visual performance model for multicore architectures , 2009, CACM.
[23] David A. Patterson,et al. In-datacenter performance analysis of a tensor processing unit , 2017, 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).
[24] Ninghui Sun,et al. DianNao family: energy-efficient hardware accelerators for machine learning , 2016, Commun. ACM.
[25] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[26] Dumitru Erhan,et al. Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[27] David A. Patterson,et al. Computer Architecture - A Quantitative Approach, 5th Edition , 2011 .
[28] David A. Patterson,et al. Computer Architecture: A Quantitative Approach , 1990 .
[29] Paolo Ienne,et al. Special-purpose digital hardware for neural networks: An architectural survey , 1996, J. VLSI Signal Process..
[30] Joel Emer,et al. Eyeriss: a spatial architecture for energy-efficient dataflow for convolutional neural networks , 2016, ACM SIGARCH Computer Architecture News.
[31] Song Han,et al. Learning both Weights and Connections for Efficient Neural Network , 2015, NIPS.