From high-level deep neural models to FPGAs
暂无分享,去创建一个
Asit K. Mishra | Hadi Esmaeilzadeh | Jongse Park | Divya Mahajan | Hardik Sharma | Emmanuel Amaro | Joon Kyung Kim | Chenkai Shao | Asit Mishra | H. Esmaeilzadeh | Jongse Park | Hardik Sharma | Emmanuel Amaro | J. Kim | Divya Mahajan | Chenkai Shao
[1] Xiang Zhang,et al. OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks , 2013, ICLR.
[2] Tao Wang,et al. Deep learning with COTS HPC systems , 2013, ICML.
[3] Ronald G. Dreslinski,et al. Sirius: An Open End-to-End Voice and Vision Personal Assistant and Its Implications for Future Warehouse Scale Computers , 2015, ASPLOS.
[4] Shengen Yan,et al. Deep Image: Scaling up Image Recognition , 2015, ArXiv.
[5] Simon Haykin,et al. GradientBased Learning Applied to Document Recognition , 2001 .
[6] Alex Krizhevsky,et al. Learning Multiple Layers of Features from Tiny Images , 2009 .
[7] Quan Chen,et al. DjiNN and Tonic: DNN as a service and its implications for future warehouse scale computers , 2015, 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA).
[8] Geoffrey E. Hinton,et al. Speech recognition with deep recurrent neural networks , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[9] Xiaowei Li,et al. C-Brain: A deep learning accelerator that tames the diversity of CNNs through adaptive data-level parallelization , 2016, 2016 53nd ACM/EDAC/IEEE Design Automation Conference (DAC).
[10] Dong Han,et al. Cambricon: An Instruction Set Architecture for Neural Networks , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[11] Andrew Zisserman,et al. Return of the Devil in the Details: Delving Deep into Convolutional Nets , 2014, BMVC.
[12] Lawrence D. Jackel,et al. Backpropagation Applied to Handwritten Zip Code Recognition , 1989, Neural Computation.
[13] Amin Ansari,et al. Bundled execution of recurring traces for energy-efficient general purpose processing , 2011, 2011 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[14] Steven Swanson,et al. Conservation cores: reducing the energy of mature computations , 2010, ASPLOS XV.
[15] Tianshi Chen,et al. ShiDianNao: Shifting vision processing closer to the sensor , 2015, 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA).
[16] Yu Cao,et al. Throughput-Optimized OpenCL-based FPGA Accelerator for Large-Scale Convolutional Neural Networks , 2016, FPGA.
[17] Joel Emer,et al. Eyeriss: a spatial architecture for energy-efficient dataflow for convolutional neural networks , 2016, CARN.
[18] Berin Martini,et al. A 240 G-ops/s Mobile Coprocessor for Deep Neural Networks , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops.
[19] Michael S. Bernstein,et al. ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.
[20] Srihari Cadambi,et al. A dynamically configurable coprocessor for convolutional neural networks , 2010, ISCA.
[21] Hadi Esmaeilzadeh,et al. TABLA: A unified template-based framework for accelerating statistical machine learning , 2016, 2016 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[22] Karthikeyan Sankaralingam,et al. Dynamically Specialized Datapaths for energy efficient computing , 2011, 2011 IEEE 17th International Symposium on High Performance Computer Architecture.
[23] Yu Wang,et al. Going Deeper with Embedded FPGA Platform for Convolutional Neural Network , 2016, FPGA.
[24] Jia Wang,et al. DaDianNao: A Machine-Learning Supercomputer , 2014, 2014 47th Annual IEEE/ACM International Symposium on Microarchitecture.
[25] Yee Whye Teh,et al. A Fast Learning Algorithm for Deep Belief Nets , 2006, Neural Computation.
[26] John Wawrzynek,et al. High Level Synthesis with a Dataflow Architectural Template , 2016, ArXiv.
[27] Jason Cong,et al. Optimizing FPGA-based Accelerator Design for Deep Convolutional Neural Networks , 2015, FPGA.
[28] Ninghui Sun,et al. DianNao: a small-footprint high-throughput accelerator for ubiquitous machine-learning , 2014, ASPLOS.
[29] Berin Martini,et al. NeuFlow: A runtime reconfigurable dataflow processor for vision , 2011, CVPR 2011 WORKSHOPS.
[30] Karthikeyan Sankaralingam,et al. Dark Silicon and the End of Multicore Scaling , 2012, IEEE Micro.
[31] Babak Falsafi,et al. Toward Dark Silicon in Servers , 2011, IEEE Micro.
[32] Xuehai Zhou,et al. PuDianNao: A Polyvalent Machine Learning Accelerator , 2015, ASPLOS.
[33] Qiang Chen,et al. Network In Network , 2013, ICLR.
[34] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[35] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.
[36] Luca Benini,et al. A ultra-low-energy convolution engine for fast brain-inspired vision in multicore clusters , 2015, 2015 Design, Automation & Test in Europe Conference & Exhibition (DATE).
[37] Trevor Darrell,et al. Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.
[38] Berin Martini,et al. Hardware accelerated convolutional neural networks for synthetic vision systems , 2010, Proceedings of 2010 IEEE International Symposium on Circuits and Systems.
[39] Natalie D. Enright Jerger,et al. Cnvlutin: Ineffectual-Neuron-Free Deep Neural Network Computing , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[40] Christoforos E. Kozyrakis,et al. Convolution engine: balancing efficiency & flexibility in specialized computing , 2013, ISCA.
[41] Jie Xu,et al. DeepBurning: Automatic generation of FPGA-based learning accelerators for the Neural Network family , 2016, 2016 53nd ACM/EDAC/IEEE Design Automation Conference (DAC).
[42] Song Han,et al. EIE: Efficient Inference Engine on Compressed Deep Neural Network , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[43] Alex Krizhevsky,et al. One weird trick for parallelizing convolutional neural networks , 2014, ArXiv.
[44] Jun-Seok Park,et al. 14.6 A 1.42TOPS/W deep convolutional neural network recognition processor for intelligent IoE systems , 2016, 2016 IEEE International Solid-State Circuits Conference (ISSCC).
[45] Gu-Yeon Wei,et al. Minerva: Enabling Low-Power, Highly-Accurate Deep Neural Network Accelerators , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).