Stochastic-Based Deep Convolutional Networks with Reconfigurable Logic Fabric

Large-scale convolutional neural networks (CNNs) are a fundamental algorithmic building block in many computer vision and artificial intelligence applications that follow the deep learning principle. However, a typically sized CNN is well known to be computationally intensive. This work presents a novel stochastic-based, scalable hardware architecture and circuit design that computes a large-scale CNN on an FPGA. The key idea is to implement all key components of a deep learning CNN, including the multi-dimensional convolution, activation, and pooling layers, entirely in the probabilistic computing domain in order to achieve high computing robustness, high performance, and low hardware usage. Our approach has three advantages. First, it achieves significantly lower algorithmic complexity for any given accuracy requirement. For an $N$-dimensional image feature map, we have theoretically proven that a random sample size of $k^* \log(N)$ is sufficient to achieve no more than 0.05 error at a 95 percent confidence level, where $k^*$ is a constant of 510. Compared with a conventional multiplier-based architecture, this computing complexity represents, on average, $8.97\times$ and $6.98\times$ performance improvements for the stochastic CNN (SCNN) and the deep SCNN, respectively. Second, the proposed stochastic-based architecture is highly fault-tolerant because the information to be processed is encoded with a large ensemble of random samples. As such, local perturbations of its computing accuracy are dissipated globally and become inconsequential to the final overall results. More interestingly, our measured results show that a 0.1 percent degradation in the CNN's computing accuracy can actually mitigate the well-known overfitting problem. Third, the architecture is highly scalable and energy efficient, making it well suited for a modular vision engine that performs real-time detection, recognition, and segmentation of mega-pixel images, especially for perception-based computing tasks that are inherently fault-tolerant yet still require high energy efficiency.
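The abstract does not spell out how the $k^* \log(N)$ sample size arises, but a bound of this form follows from a standard concentration-plus-union-bound argument. The sketch below assumes independent samples normalized to $[0, 1]$ and is not necessarily the authors' exact proof:

\[
\Pr\big(|\hat{\mu}_i - \mu_i| \ge \epsilon\big) \le 2e^{-2n\epsilon^2} \quad \text{(Hoeffding bound for the mean of } n \text{ samples)},
\]
\[
2N e^{-2n\epsilon^2} \le \delta \;\Longrightarrow\; n \ge \frac{\ln(2N/\delta)}{2\epsilon^2} \quad \text{(union bound over the } N \text{ feature-map outputs)}.
\]

Setting $\epsilon = 0.05$ and $\delta = 0.05$ (a 95 percent confidence level) gives $n \ge 200\,\ln(40N) = O(\log N)$, with a leading constant of a few hundred, which is consistent in form with the reported $k^*$ of 510.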
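To make the "probabilistic computing domain" concrete, the following is a minimal software model of the basic stochastic-computing primitive: a unipolar stochastic multiply (bitwise AND of Bernoulli bitstreams) accumulated into a small convolution sum. The function names, the NumPy simulation, and the stream length n are illustrative assumptions on my part; the paper's actual design realizes these operations directly in FPGA logic.

import numpy as np

def to_bitstream(p, n, rng):
    # Encode a value p in [0, 1] as a length-n Bernoulli(p) bitstream.
    return rng.random(n) < p

def stochastic_mul(a_bits, b_bits):
    # Unipolar stochastic multiply: the bitwise AND of two independent streams
    # has a ones-density that approximates the product of the encoded values.
    return a_bits & b_bits

def stochastic_dot(pixels, weights, n=4096, seed=0):
    # Approximate sum_i pixels[i] * weights[i] from bitstreams; the stream
    # length n plays the role of the random sample budget in the abstract.
    # A real design would accumulate with scaled stochastic adders (mux trees)
    # rather than a floating-point sum of popcount averages.
    rng = np.random.default_rng(seed)
    acc = 0.0
    for p, w in zip(pixels, weights):
        prod = stochastic_mul(to_bitstream(p, n, rng), to_bitstream(w, n, rng))
        acc += prod.mean()
    return acc

# Example: a 3-tap "convolution" with inputs and weights pre-scaled to [0, 1].
print(stochastic_dot([0.2, 0.7, 0.5], [0.9, 0.4, 0.6]))  # close to 0.76

Note that the unipolar encoding shown here only handles non-negative values; signed weights would use the bipolar encoding, where multiplication is an XNOR gate rather than an AND gate.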
