FPGA-Based Accelerators of Deep Learning Networks for Learning and Classification: A Review
[1] Quan Chen,et al. DjiNN and Tonic: DNN as a service and its implications for future warehouse scale computers , 2015, 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA).
[2] Patrice Y. Simard,et al. High Performance Convolutional Neural Networks for Document Processing , 2006 .
[3] Dong Yu,et al. Deep Learning: Methods and Applications , 2014, Found. Trends Signal Process..
[4] S. Winograd. Arithmetic complexity of computations , 1980 .
[5] Rob Fergus,et al. Visualizing and Understanding Convolutional Networks , 2013, ECCV.
[6] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[7] E.A. Lee,et al. Synchronous data flow , 1987, Proceedings of the IEEE.
[8] Jason Cong,et al. High-Level Synthesis for FPGAs: From Prototyping to Deployment , 2011, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[9] Yoshua Bengio,et al. BinaryNet: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1 , 2016, ArXiv.
[10] Yoon Kim,et al. Convolutional Neural Networks for Sentence Classification , 2014, EMNLP.
[11] Yu Cao,et al. Optimizing Loop Operation and Dataflow in FPGA Acceleration of Deep Convolutional Neural Networks , 2017, FPGA.
[12] Vivienne Sze,et al. Hardware for machine learning: Challenges and opportunities , 2017, 2017 IEEE Custom Integrated Circuits Conference (CICC).
[13] William J. Dally,et al. GPUs and the Future of Parallel Computing , 2011, IEEE Micro.
[14] Yun Liang,et al. High-Level Synthesis: Productivity, Performance, and Software Constraints , 2012, J. Electr. Comput. Eng..
[15] Srihari Cadambi,et al. A programmable parallel accelerator for learning and classification , 2010, 2010 19th International Conference on Parallel Architectures and Compilation Techniques (PACT).
[16] Viktor Prasanna,et al. Frequency Domain Acceleration of Convolutional Neural Networks on CPU-FPGA Shared Memory System , 2017, FPGA.
[17] Mohamed S. Abdelfattah,et al. DLA: Compiler and FPGA Overlay for Neural Network Inference Acceleration , 2018, 2018 28th International Conference on Field Programmable Logic and Applications (FPL).
[18] Shijie Li,et al. Throughput-Optimized FPGA Accelerator for Deep Convolutional Neural Networks , 2017, ACM Trans. Reconfigurable Technol. Syst..
[19] Joan Bruna,et al. Exploiting Linear Structure Within Convolutional Networks for Efficient Evaluation , 2014, NIPS.
[20] David F. Bacon,et al. Compiler transformations for high-performance computing , 1994, CSUR.
[21] Yanjun Qi,et al. Learning to rank with (a lot of) word features , 2010, Information Retrieval.
[22] Karin Strauss,et al. Accelerating Deep Convolutional Neural Networks Using Specialized Hardware , 2015 .
[23] Srihari Cadambi,et al. A Massively Parallel FPGA-Based Coprocessor for Support Vector Machines , 2009, 2009 17th IEEE Symposium on Field Programmable Custom Computing Machines.
[24] Urs A. Muller,et al. A multi-range vision strategy for autonomous offroad navigation , 2007 .
[25] Samuel Williams,et al. Roofline: an insightful visual performance model for multicore architectures , 2009, CACM.
[26] Guigang Zhang,et al. Deep Learning , 2016, Int. J. Semantic Comput..
[27] David G. Lowe,et al. Multiclass Object Recognition with Sparse, Localized Features , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).
[28] Jason Cong,et al. Caffeine: Toward Uniformed Representation and Acceleration for Deep Convolutional Neural Networks , 2019, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[29] Alex Krizhevsky,et al. Learning Multiple Layers of Features from Tiny Images , 2009 .
[30] Xuehai Zhou,et al. PuDianNao: A Polyvalent Machine Learning Accelerator , 2015, ASPLOS.
[31] Geoffrey E. Hinton,et al. Speech recognition with deep recurrent neural networks , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[32] Paris Smaragdis,et al. Bitwise Neural Networks , 2016, ArXiv.
[33] Dumitru Erhan,et al. Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[34] Christos-Savvas Bouganis,et al. Latency-driven design for FPGA-based convolutional neural networks , 2017, 2017 27th International Conference on Field Programmable Logic and Applications (FPL).
[35] Henk Corporaal,et al. Memory-centric accelerator design for Convolutional Neural Networks , 2013, 2013 IEEE 31st International Conference on Computer Design (ICCD).
[36] Christos-Savvas Bouganis,et al. fpgaConvNet: A Framework for Mapping Convolutional Neural Networks on FPGAs , 2016, 2016 IEEE 24th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM).
[37] Gang Hua,et al. A convolutional neural network cascade for face detection , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[38] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[39] Michele Magno,et al. Accelerating real-time embedded scene labeling with convolutional networks , 2015, 2015 52nd ACM/EDAC/IEEE Design Automation Conference (DAC).
[40] F. Cardells-Tormo,et al. Area-efficient 2D shift-variant convolvers for FPGA-based digital image processing , 2005, International Conference on Field Programmable Logic and Applications, 2005..
[41] Ali Farhadi,et al. XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks , 2016, ECCV.
[42] John Freeman,et al. OpenCL for FPGAs: Prototyping a Compiler , 2013 .
[43] Stacy Holman Jones. Torch , 1999 .
[44] Lawrence D. Jackel,et al. Handwritten Digit Recognition with a Back-Propagation Network , 1989, NIPS.
[45] Yann LeCun,et al. An FPGA-based stream processor for embedded real-time vision with Convolutional Networks , 2009, 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops.
[46] Beatriz Blanco-Filgueira,et al. Deep Learning-Based Multiple Object Visual Tracking on Embedded System for IoT and Mobile Edge Computing Applications , 2018, IEEE Internet of Things Journal.
[47] John Tran,et al. cuDNN: Efficient Primitives for Deep Learning , 2014, ArXiv.
[48] Xiaowei Li,et al. FlexFlow: A Flexible Dataflow Accelerator Architecture for Convolutional Neural Networks , 2017, 2017 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[49] Trishul M. Chilimbi,et al. Project Adam: Building an Efficient and Scalable Deep Learning Training System , 2014, OSDI.
[50] Geoffrey E. Hinton,et al. Acoustic Modeling Using Deep Belief Networks , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[51] Lawrence D. Jackel,et al. Backpropagation Applied to Handwritten Zip Code Recognition , 1989, Neural Computation.
[52] Misha Denil,et al. Predicting Parameters in Deep Learning , 2014 .
[53] Yann LeCun,et al. Off-Road Obstacle Avoidance through End-to-End Learning , 2005, NIPS.
[54] Song Han,et al. ESE: Efficient Speech Recognition Engine with Sparse LSTM on FPGA , 2016, FPGA.
[55] Rafael Gadea Gironés,et al. FPGA Implementation of a Pipelined On-Line Backpropagation , 2005, J. VLSI Signal Process..
[56] Shawki Areibi,et al. The Impact of Arithmetic Representation on Implementing MLP-BP on FPGAs: A Study , 2007, IEEE Transactions on Neural Networks.
[57] Jason Cong,et al. Caffeine: Towards uniformed representation and acceleration for deep convolutional neural networks , 2016, 2016 IEEE/ACM International Conference on Computer-Aided Design (ICCAD).
[58] Berin Martini,et al. Large-Scale FPGA-based Convolutional Networks , 2011 .
[59] Mohamed S. Abdelfattah,et al. Gzip on a chip: high performance lossless data compression on FPGAs using OpenCL , 2014, IWOCL '14.
[60] Yu Cao,et al. Throughput-Optimized OpenCL-based FPGA Accelerator for Large-Scale Convolutional Neural Networks , 2016, FPGA.
[61] Geoffrey E. Hinton,et al. Learning representations by back-propagating errors , 1986, Nature.
[62] Yann LeCun,et al. Comparing SVM and convolutional networks for epileptic seizure prediction from intracranial EEG , 2008, 2008 IEEE Workshop on Machine Learning for Signal Processing.
[63] Jason Weston,et al. A unified architecture for natural language processing: deep neural networks with multitask learning , 2008, ICML '08.
[64] Xi Chen,et al. FP-DNN: An Automated Framework for Mapping Deep Neural Networks onto FPGAs with RTL-HLS Hybrid Templates , 2017, 2017 IEEE 25th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM).
[65] Qiang Chen,et al. Network In Network , 2013, ICLR.
[66] Indranil Saha,et al. journal homepage: www.elsevier.com/locate/neucom , 2022 .
[67] Miao Hu,et al. ISAAC: A Convolutional Neural Network Accelerator with In-Situ Analog Arithmetic in Crossbars , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[68] Yu Cao,et al. Scalable and modularized RTL compilation of Convolutional Neural Networks onto FPGA , 2016, 2016 26th International Conference on Field Programmable Logic and Applications (FPL).
[69] M H Alsuwaiyel. Algorithms: Design Techniques and Analysis (Revised Edition) , 2016 .
[70] Qi Yu,et al. DLAU: A Scalable Deep Learning Accelerator Unit on FPGA , 2016, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[71] Jürgen Schmidhuber,et al. Deep learning in neural networks: An overview , 2014, Neural Networks.
[72] Luis Herranz,et al. Heterogeneous Convolutional Neural Networks for Visual Recognition , 2016, PCM.
[73] Yann LeCun,et al. Learning long‐range vision for autonomous off‐road driving , 2009, J. Field Robotics.
[74] Aaftab Munshi,et al. The OpenCL specification , 2009, 2009 IEEE Hot Chips 21 Symposium (HCS).
[75] Jason Helge Anderson,et al. LegUp: high-level synthesis for FPGA-based processor/accelerator systems , 2011, FPGA '11.
[76] Tianshi Chen,et al. ShiDianNao: Shifting vision processing closer to the sensor , 2015, 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA).
[77] Joel Emer,et al. Eyeriss: a spatial architecture for energy-efficient dataflow for convolutional neural networks , 2016, CARN.
[78] Berin Martini,et al. A 240 G-ops/s Mobile Coprocessor for Deep Neural Networks , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops.
[79] Yann LeCun,et al. Optimal Brain Damage , 1989, NIPS.
[80] Yoshua Bengio,et al. Convolutional networks for images, speech, and time series , 1998 .
[81] Eriko Nurvitadhi,et al. Can FPGAs Beat GPUs in Accelerating Next-Generation Deep Neural Networks? , 2017, FPGA.
[82] Shuchang Zhou,et al. DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients , 2016, ArXiv.
[83] Jason Cong,et al. Optimizing FPGA-based Accelerator Design for Deep Convolutional Neural Networks , 2015, FPGA.
[84] Douglas L. Maskell,et al. Efficient Overlay Architecture Based on DSP Blocks , 2015, 2015 IEEE 23rd Annual International Symposium on Field-Programmable Custom Computing Machines.
[85] Yann LeCun,et al. CNP: An FPGA-based processor for Convolutional Networks , 2009, 2009 International Conference on Field Programmable Logic and Applications.
[86] Michael S. Bernstein,et al. ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.
[87] Trevor Darrell,et al. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[88] Yuan Yu,et al. TensorFlow: A system for large-scale machine learning , 2016, OSDI.
[89] Srihari Cadambi,et al. A Massively Parallel Digital Learning Processor , 2008, NIPS.
[90] Tara N. Sainath,et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups , 2012, IEEE Signal Processing Magazine.
[91] Hui Zhang,et al. A Multiwindow Partial Buffering Scheme for FPGA-Based 2-D Convolvers , 2007, IEEE Transactions on Circuits and Systems II: Express Briefs.
[92] Tao Zhang,et al. PRIME: A Novel Processing-in-Memory Architecture for Neural Network Computation in ReRAM-Based Main Memory , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[93] Ran El-Yaniv,et al. Binarized Neural Networks , 2016, NIPS.
[94] Philip Heng Wai Leong,et al. FINN: A Framework for Fast, Scalable Binarized Neural Network Inference , 2016, FPGA.
[95] Jack J. Dongarra,et al. Automatically Tuned Linear Algebra Software , 1998, Proceedings of the IEEE/ACM SC98 Conference.
[96] Atsushi Sato,et al. Generalized Learning Vector Quantization , 1995, NIPS.
[97] Paulo J. G. Lisboa,et al. Artificial Neural Networks in Biomedicine , 2000, Perspectives in Neural Computing.
[98] Song Han,et al. Deep Compression: Compressing Deep Neural Network with Pruning, Trained Quantization and Huffman Coding , 2015, ICLR.
[99] Ronald Davis,et al. Neural networks and deep learning , 2017 .
[100] Shawki Areibi,et al. Deep Learning on FPGAs: Past, Present, and Future , 2016, ArXiv.
[101] Stefan Wermter,et al. A Multichannel Convolutional Neural Network for Hand Posture Recognition , 2014, ICANN.
[102] Yann LeCun,et al. The MNIST database of handwritten digits , 2005 .
[103] Thomas Serre,et al. Robust Object Recognition with Cortex-Like Mechanisms , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[104] Jason Cong,et al. Improving high level synthesis optimization opportunity through polyhedral transformations , 2013, FPGA '13.
[105] Shuai Wang,et al. Deep learning for sentiment analysis: A survey , 2018, WIREs Data Mining Knowl. Discov..
[106] Wim Vanderbauwhede,et al. High-Performance Computing Using FPGAs , 2013 .
[107] Jian Sun,et al. Identity Mappings in Deep Residual Networks , 2016, ECCV.
[108] Sadiq M. Sait,et al. Iterative computer algorithms with applications in engineering - solving combinatorial optimization problems , 2000 .
[109] Jia Wang,et al. DaDianNao: A Machine-Learning Supercomputer , 2014, 2014 47th Annual IEEE/ACM International Symposium on Microarchitecture.
[110] Wei Liu,et al. SSD: Single Shot MultiBox Detector , 2015, ECCV.
[111] P. L. Montgomery,et al. A survey of modern integer factorization algorithms , 1994 .
[112] Forrest N. Iandola,et al. SqueezeDet: Unified, Small, Low Power Fully Convolutional Neural Networks for Real-Time Object Detection for Autonomous Driving , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[113] Jian Sun,et al. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[114] Steven Pigeon,et al. VIP: an FPGA-based processor for image processing and neural networks , 1996, Proceedings of Fifth International Conference on Microelectronics for Neural Networks.
[115] J. MacQueen. Some methods for classification and analysis of multivariate observations , 1967 .
[116] Hadi Esmaeilzadeh,et al. Neural acceleration for GPU throughput processors , 2015, 2015 48th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[117] Khaled Benkrid,et al. Design and implementation of a 2D convolution core for video applications on FPGAs , 2002, Third International Workshop on Digital and Computational Video, 2002. DCV 2002. Proceedings..
[118] Luis Ceze,et al. Neural Acceleration for General-Purpose Approximate Programs , 2014, IEEE Micro.
[119] Tara N. Sainath,et al. Improving deep neural networks for LVCSR using rectified linear units and dropout , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[120] Denis F. Wolf,et al. USING EMBEDDED PROCESSORS IN HARDWARE MODELS OF ARTIFICIAL NEURAL NETWORKS , 2001 .
[121] Gerald Penn,et al. Convolutional Neural Networks for Speech Recognition , 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[122] Jie Xu,et al. DeepBurning: Automatic generation of FPGA-based learning accelerators for the Neural Network family , 2016, 2016 53rd ACM/EDAC/IEEE Design Automation Conference (DAC).
[123] C. Loan. Computational Frameworks for the Fast Fourier Transform , 1992 .
[124] Jing Li,et al. Improving the Performance of OpenCL-based FPGA Accelerator for Convolutional Neural Network , 2017, FPGA.
[125] Vivienne Sze,et al. Efficient Processing of Deep Neural Networks: A Tutorial and Survey , 2017, Proceedings of the IEEE.
[126] Judith E. Dayhoff,et al. Neural Network Architectures: An Introduction , 1989 .
[127] Berin Martini,et al. NeuFlow: A runtime reconfigurable dataflow processor for vision , 2011, CVPR 2011 WORKSHOPS.
[128] Sergey Ioffe,et al. Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning , 2016, AAAI.
[129] James A. Anderson,et al. Neurocomputing: Foundations of Research , 1988 .
[130] Takashi Morie,et al. Projection-Field-Type VLSI Convolutional Neural Networks Using Merged/Mixed Analog-Digital Approach , 2007, ICONIP.
[131] Alan L. Yuille,et al. Genetic CNN , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[132] Geoffrey E. Hinton,et al. Rectified Linear Units Improve Restricted Boltzmann Machines , 2010, ICML.
[133] Geoffrey E. Hinton,et al. Deep Learning , 2015, Nature.
[134] Geoffrey E. Hinton,et al. Application of Deep Belief Networks for Natural Language Understanding , 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[135] Kaiming He,et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[136] Jason Cong,et al. Polyhedral-based data reuse optimization for configurable computing , 2013, FPGA '13.
[137] L. Bottou,et al. Deep Convolutional Networks for Scene Parsing , 2009 .
[138] Tianshi Chen,et al. DaDianNao: A Neural Network Supercomputer , 2017, IEEE Transactions on Computers.
[139] Larry P. Heck,et al. Learning deep structured semantic models for web search using clickthrough data , 2013, CIKM.
[140] Fei-Fei Li,et al. Large-Scale Video Classification with Convolutional Neural Networks , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[141] Christoforos E. Kozyrakis,et al. Understanding sources of inefficiency in general-purpose chips , 2010, ISCA.
[142] Gernot A. Fink,et al. Face Detection Using GPU-Based Convolutional Neural Networks , 2009, CAIP.
[143] Song Han,et al. EIE: Efficient Inference Engine on Compressed Deep Neural Network , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[144] Andrew C. Ling,et al. An OpenCL™ Deep Learning Accelerator on Arria 10 , 2017 .
[145] Yu Cao,et al. An automatic RTL compiler for high-throughput FPGA implementation of diverse deep convolutional neural networks , 2017, 2017 27th International Conference on Field Programmable Logic and Applications (FPL).
[146] Patrice Y. Simard,et al. Best practices for convolutional neural networks applied to visual document analysis , 2003, Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings..
[147] John E. Stone,et al. OpenCL: A Parallel Programming Standard for Heterogeneous Computing Systems , 2010, Computing in Science & Engineering.
[148] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[149] Martin C. Herbordt,et al. Computing Models for FPGA-Based Accelerators , 2008, Computing in Science & Engineering.
[150] Jagath C. Rajapakse,et al. FPGA Implementations of Neural Networks , 2006 .
[151] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.
[152] Trevor Darrell,et al. Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.
[153] Andrew C. Ling,et al. An OpenCL™ Deep Learning Accelerator on Arria 10 , 2017, FPGA.
[154] David Gregg,et al. Parallel Multi Channel convolution using General Matrix Multiplication , 2017, 2017 IEEE 28th International Conference on Application-specific Systems, Architectures and Processors (ASAP).
[155] Srihari Cadambi,et al. A Massively Parallel Coprocessor for Convolutional Neural Networks , 2009, 2009 20th IEEE International Conference on Application-specific Systems, Architectures and Processors.
[156] Francisco Cardells-Tormo,et al. Area-efficient 2-D shift-variant convolvers for FPGA-based digital image processing , 2005, IEEE Workshop on Signal Processing Systems Design and Implementation, 2005..
[157] Andrew Lavin,et al. Fast Algorithms for Convolutional Neural Networks , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[158] Ah Chung Tsoi,et al. Face recognition: a convolutional neural-network approach , 1997, IEEE Trans. Neural Networks.
[159] T. Morie,et al. An image filtering processor for face/object recognition using merged/mixed analog-digital architecture , 2005, Digest of Technical Papers. 2005 Symposium on VLSI Circuits, 2005..
[160] Forrest N. Iandola,et al. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <1MB model size , 2016, ArXiv.
[161] Ninghui Sun,et al. DianNao: a small-footprint high-throughput accelerator for ubiquitous machine-learning , 2014, ASPLOS.
[162] Yu Cao,et al. ALAMO: FPGA acceleration of deep learning algorithms with a modularized RTL compiler , 2018, Integr..
[163] M. Balakrishnan,et al. Architecture Exploration of FPGA Based Accelerators for BioInformatics Applications , 2016 .
[164] Lorien Y. Pratt,et al. Comparing Biases for Minimal Network Construction with Back-Propagation , 1988, NIPS.
[165] Shengen Yan,et al. Evaluating Fast Algorithms for Convolutional Neural Networks on FPGAs , 2017, 2017 IEEE 25th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM).
[166] C. Reeves. Modern heuristic techniques for combinatorial problems , 1993 .
[167] Yu Cao,et al. Optimizing the Convolution Operation to Accelerate Deep Neural Networks on FPGA , 2018, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[168] Michael Ferdman,et al. Maximizing CNN accelerator efficiency through resource partitioning , 2016, 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).
[169] Babak Hassibi,et al. Second Order Derivatives for Network Pruning: Optimal Brain Surgeon , 1992, NIPS.
[170] Jason Cong,et al. Minimizing Computation in Convolutional Neural Networks , 2014, ICANN.
[171] Mohamad Ivan Fanany,et al. Metaheuristic Algorithms for Convolution Neural Network , 2016, Comput. Intell. Neurosci..
[172] Yu Wang,et al. Angel-Eye: A Complete Design Flow for Mapping CNN Onto Embedded FPGA , 2018, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[173] Jason Cong,et al. Energy-Efficient CNN Implementation on a Deeply Pipelined FPGA Cluster , 2016, ISLPED.
[174] Berin Martini,et al. Hardware accelerated convolutional neural networks for synthetic vision systems , 2010, Proceedings of 2010 IEEE International Symposium on Circuits and Systems.
[175] Yu Cao,et al. End-to-end scalable FPGA accelerator for deep residual networks , 2017, 2017 IEEE International Symposium on Circuits and Systems (ISCAS).
[176] Masanori Hariyama,et al. Design of FPGA-Based Computing Systems with OpenCL , 2017 .
[177] Yvon Savaria,et al. Reconfigurable pipelined 2-D convolvers for fast digital signal processing , 1999, IEEE Trans. Very Large Scale Integr. Syst..
[178] J. Dixon. Asymptotically fast factorization of integers , 1981 .
[179] Xuegong Zhou,et al. A high performance FPGA-based accelerator for large-scale convolutional neural networks , 2016, 2016 26th International Conference on Field Programmable Logic and Applications (FPL).
[180] Mark S. Rzepczynski. Neural Networks in Finance: Gaining Predictive Edge in the Markets (a review) , 2007 .
[181] Wojciech Zaremba,et al. Recurrent Neural Network Regularization , 2014, ArXiv.
[182] John C. Platt,et al. Fast training of support vector machines using sequential minimal optimization, advances in kernel methods , 1999 .
[183] Yann LeCun,et al. A multirange architecture for collision‐free off‐road robot navigation , 2009, J. Field Robotics.
[184] Ilya Kostrikov,et al. PlaNet - Photo Geolocation with Convolutional Neural Networks , 2016, ECCV.
[185] Jun Zhao,et al. Recurrent Convolutional Neural Networks for Text Classification , 2015, AAAI.
[186] Peng Zhang,et al. Automated systolic array architecture synthesis for high throughput CNN inference on FPGAs , 2017, 2017 54th ACM/EDAC/IEEE Design Automation Conference (DAC).
[187] Wonyong Sung,et al. Resiliency of Deep Neural Networks under Quantization , 2015, ArXiv.
[188] Srihari Cadambi,et al. A dynamically configurable coprocessor for convolutional neural networks , 2010, ISCA.
[189] Yoshua Bengio,et al. Learning Deep Architectures for AI , 2007, Found. Trends Mach. Learn..
[190] Yu Wang,et al. Going Deeper with Embedded FPGA Platform for Convolutional Neural Network , 2016, FPGA.
[191] Song Han,et al. Learning both Weights and Connections for Efficient Neural Network , 2015, NIPS.
[192] Yen-Cheng Kuan,et al. A Reconfigurable Streaming Deep Convolutional Neural Network Accelerator for Internet of Things , 2017, IEEE Transactions on Circuits and Systems I: Regular Papers.
[193] Matthew J. Hausknecht,et al. Beyond short snippets: Deep networks for video classification , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).