Convolutional neural network acceleration with hardware/software co-design

Convolutional Neural Networks (CNNs) have a broad range of applications, such as image processing and natural language processing. Inspired by the mammalian visual cortex, CNNs have achieved impressive results on a number of computer vision challenges, but often at the cost of large amounts of processing power and without any timing constraints. This paper presents a design methodology for accelerating CNNs using hardware/software co-design techniques to balance performance and flexibility, particularly for resource-constrained systems. The methodology is applied to a gender recognition case study, using an ARM processor and FPGA fabric to create an embedded system that can process facial images in real time.
