Paper information - Hardware architecture for large parallel array of Random Feature Extractors applied to image recognition

We demonstrate a low-power and compact hardware implementation of a Random Feature Extractor (RFE) core. Because complex tasks such as image recognition require a large set of features, we show how a weight reuse technique can virtually expand the set of random features available from the RFE core. Further, we show how to avoid the computation cost wasted on propagating "incognizant" or redundant random features. As a proof of concept, we validated our approach by using our RFE core as the first stage of an Extreme Learning Machine (ELM), a two-layer neural network, and achieved $>97\%$ accuracy on the MNIST database of handwritten digits. The ELM's first stage (the RFE) is implemented on an analog ASIC occupying a $5$ mm $\times$ $5$ mm area in $0.35$ $\mu$m CMOS and consuming $5.95$ $\mu$J/classify while using $\approx 5000$ effective hidden neurons. The ELM's second stage, which consists only of adders, can be implemented as a digital circuit with an estimated energy consumption of $20.9$ nJ/classify. With a total energy consumption of only $5.97$ $\mu$J/classify, this low-power mixed-signal ASIC can act as a co-processor in portable electronic gadgets with cameras.
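The two-stage structure described in the abstract can be illustrated with a minimal software sketch. It is not the authors' circuit or training procedure. Several parts are assumptions made only for illustration: the `tanh` nonlinearity, the ridge-regression readout, the toy Gaussian data, and in particular the modelling of weight reuse as a circular shift of the input against one fixed random matrix (the abstract does not say how reuse is performed). All function names are hypothetical.

```python
import numpy as np

def random_features(X, W, b, n_shifts=1):
    """First stage: fixed random projection followed by a nonlinearity.

    ASSUMPTION: weight reuse is modelled by circularly shifting the input
    and passing it through the same matrix W again; the abstract does not
    specify the actual reuse scheme.
    """
    blocks = []
    for s in range(n_shifts):
        Xs = np.roll(X, shift=s, axis=1)    # shifted copy of every input row
        blocks.append(np.tanh(Xs @ W + b))  # tanh is an illustrative choice
    # n_shifts blocks of W.shape[1] features each -> "virtual" hidden neurons
    return np.hstack(blocks)

def train_readout(H, Y, ridge=1e-2):
    """Second stage: linear output weights from closed-form ridge regression."""
    A = H.T @ H + ridge * np.eye(H.shape[1])
    return np.linalg.solve(A, H.T @ Y)

def predict(H, beta):
    """Class decision: index of the largest output score."""
    return np.argmax(H @ beta, axis=1)

def demo(seed=0):
    """Toy run on three Gaussian clusters (not MNIST)."""
    rng = np.random.default_rng(seed)
    n_in, n_phys, n_shifts, n_classes, n_samples = 16, 32, 4, 3, 300
    # Fixed random first-layer parameters; only the readout is trained.
    W = rng.normal(size=(n_in, n_phys)) / np.sqrt(n_in)
    b = rng.normal(size=n_phys)
    centers = rng.normal(scale=3.0, size=(n_classes, n_in))
    labels = rng.integers(0, n_classes, size=n_samples)
    X = centers[labels] + rng.normal(size=(n_samples, n_in))
    Y = np.eye(n_classes)[labels]           # one-hot targets
    H = random_features(X, W, b, n_shifts)
    beta = train_readout(H, Y)
    train_acc = np.mean(predict(H, beta) == labels)
    return H.shape, train_acc

if __name__ == "__main__":
    shape, acc = demo()
    print("hidden feature matrix shape:", shape)
    print("training accuracy on toy data:", acc)
```

In this sketch, 32 physical random columns reused over 4 shifts give 128 effective features, which mirrors the idea of expanding the feature set without storing more weights. The energy figures in the abstract are also consistent with each other: $5.95$ $\mu$J $+$ $0.0209$ $\mu$J $\approx 5.97$ $\mu$J per classification.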

Arindam Basu | Aakash Patil | Shanlan Shen | Enyi Yao
