Fast Convolution Operations on Many-Core Architectures
Lei Shi | Shigang Li | Yunquan Zhang | Chunyang Xiang
[1] Jeff Johnson, et al. Fast Convolutional Nets With fbfft: A GPU Performance Evaluation, 2014, ICLR.
[2] Trevor Darrell, et al. Caffe: Convolutional Architecture for Fast Feature Embedding, 2014, ACM Multimedia.
[3] Shengen Yan, et al. Deep Image: Scaling up Image Recognition, 2015, ArXiv.
[4] Tao Wang, et al. Deep learning with COTS HPC systems, 2013, ICML.
[5] Martin Cadík, et al. FFT and Convolution Performance in Image Filtering on GPU, 2006, Tenth International Conference on Information Visualisation (IV'06).
[6] John E. Stone, et al. OpenCL: A Parallel Programming Standard for Heterogeneous Computing Systems, 2010, Computing in Science & Engineering.
[7] Chang-Sung Jeong, et al. Accelerating Multi-scale Image Fusion Algorithms Using CUDA, 2009, 2009 International Conference of Soft Computing and Pattern Recognition.
[8] Alex Krizhevsky, et al. One weird trick for parallelizing convolutional neural networks, 2014, ArXiv.
[9] John Tran, et al. cuDNN: Efficient Primitives for Deep Learning, 2014, ArXiv.
[10] Victor Podlozhnyuk, et al. Image Convolution with CUDA, 2007.
[11] Jack J. Dongarra, et al. Autotuning GEMM Kernels for the Fermi GPU, 2012, IEEE Transactions on Parallel and Distributed Systems.