Speed-Up of Object Detection Neural Network with GPU
Atsushi Ike | Akira Nakagawa | Koichi Shirahata | Yasumoto Tomita | Satoshi Tanabe | Kyosuke Maeda | Takuya Fukagai
[1] Yeongjae Cheon, et al. PVANet: Lightweight Deep Neural Networks for Real-time Object Detection, 2016, ArXiv.
[2] Andrew Zisserman, et al. Very Deep Convolutional Networks for Large-Scale Image Recognition, 2014, ICLR.
[3] Ali Farhadi, et al. YOLO9000: Better, Faster, Stronger, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[4] Yann LeCun, et al. Fast Training of Convolutional Networks through FFTs, 2013, ICLR.
[5] Pradeep Dubey, et al. Fast sort on CPUs and GPUs: a case for bandwidth oblivious SIMD sort, 2010, SIGMOD Conference.
[6] Norbert Luttenberger, et al. A Novel Sorting Algorithm for Many-core Architectures Based on Adaptive Bitonic Sort, 2012, 2012 IEEE 26th International Parallel and Distributed Processing Symposium.
[7] Kaiming He, et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[8] Jeff Johnson, et al. Fast Convolutional Nets With fbfft: A GPU Performance Evaluation, 2014, ICLR.
[9] Ali Farhadi, et al. You Only Look Once: Unified, Real-Time Object Detection, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[10] Andrew Lavin, et al. Fast Algorithms for Convolutional Neural Networks, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[11] Yi Li, et al. R-FCN: Object Detection via Region-based Fully Convolutional Networks, 2016, NIPS.
[12] Vitaly Osipov, et al. GPU sample sort, 2010, 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS).
[13] Andrew A. Davidson, et al. Efficient parallel merge sort for fixed and variable length keys, 2012, 2012 Innovative Parallel Computing (InPar).
[14] Michael Garland, et al. Designing efficient sorting algorithms for manycore GPUs, 2009, 2009 IEEE International Symposium on Parallel & Distributed Processing.
[15] S. Winograd. Arithmetic complexity of computations, 1980.
[16] Ross B. Girshick, et al. Fast R-CNN, 2015, arXiv:1504.08083.
[17] Hirotaka Tamura, et al. Fast algorithm using summed area tables with unified layer performing convolution and average pooling, 2017, 2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP).
[18] Wei Liu, et al. SSD: Single Shot MultiBox Detector, 2015, ECCV.
[19] Sergio Guadarrama, et al. Speed/Accuracy Trade-Offs for Modern Convolutional Object Detectors, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).