Optimizing N-dimensional, Winograd-based convolution for manycore CPUs
Frédo Durand | Zhen Jia | Kai Li | Aleksandar Zlateski