Speeding up Convolutional Neural Network Training with Dynamic Precision Scaling and Flexible Multiplier-Accumulator
[1] H. T. Kung. Why systolic architectures? IEEE Computer, 1982.
[2] D. Williamson. Dynamically scaled fixed point arithmetic. IEEE Pacific Rim Conference on Communications, Computers and Signal Processing, 1991.
[3] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 1998.
[4] J. Dean et al. Large Scale Distributed Deep Networks. NIPS, 2012.
[5] A. Krizhevsky, I. Sutskever, and G. E. Hinton. ImageNet classification with deep convolutional neural networks. NIPS, 2012.
[6] M. Courbariaux, Y. Bengio, and J.-P. David. Training deep neural networks with low precision multiplications. arXiv, 2014.
[7] S. Chetlur et al. cuDNN: Efficient Primitives for Deep Learning. arXiv, 2014.
[8] Y. Jia et al. Caffe: Convolutional Architecture for Fast Feature Embedding. ACM Multimedia, 2014.
[9] T. M. Chilimbi et al. Project Adam: Building an Efficient and Scalable Deep Learning Training System. OSDI, 2014.
[10] J. Cong and B. Xiao. Minimizing Computation in Convolutional Neural Networks. ICANN, 2014.
[11] C. Zhang et al. Optimizing FPGA-based Accelerator Design for Deep Convolutional Neural Networks. FPGA, 2015.
[12] S. Gupta, A. Agrawal, K. Gopalakrishnan, and P. Narayanan. Deep Learning with Limited Numerical Precision. ICML, 2015.
[13] C. Szegedy et al. Going deeper with convolutions. CVPR, 2015.
[14] D. D. Lin, S. S. Talathi, and V. S. Annapureddy. Fixed Point Quantization of Deep Convolutional Networks. ICML, 2016.
[15] Z. Lin, M. Courbariaux, R. Memisevic, and Y. Bengio. Neural Networks with Few Multiplications. ICLR, 2016.