A 345 mW Heterogeneous Many-Core Processor With an Intelligent Inference Engine for Robust Object Recognition

Fast and robust object recognition of cluttered scenes presents two main challenges: (1) the large number of features to process requires high computational power, and (2) false matches from background clutter can degrade recognition accuracy. Previously, saliency based bottom-up visual attention [1,2] increased recognition speed by confining the recognition processing only to the salient regions. But these schemes had an inherent problem: the accuracy of the attention itself. If attention is paid to the false region, which is common when saliency cannot distinguish between clutter and object, recognition accuracy is degraded. In order to improve the attention accuracy, we previously reported an algorithm, the Unified Visual Attention Model (UVAM) [3], which incorporates the familiarity map on top of the saliency map for the search of attentive points. It can cross-check the accuracy of attention deployment by combining top-down attention, searching for “meaningful objects”, and bottom-up attention, just looking for conspicuous points. This paper presents a heterogeneous many-core (note: we use the term “many-core” instead of “multi-core” to emphasize the large number of cores) processor that realizes the UVAM algorithm to achieve fast and robust object recognition of cluttered video sequences.

[1]  Dieter Schmalstieg,et al.  Real-Time Detection and Tracking for Augmented Reality on Mobile Phones , 2010, IEEE Transactions on Visualization and Computer Graphics.

[2]  David G. Lowe,et al.  Distinctive Image Features from Scale-Invariant Keypoints , 2004, International Journal of Computer Vision.

[3]  Marwan A. Jabri,et al.  Weight perturbation: an optimal architecture and learning technique for analog VLSI feedforward and recurrent multilayer networks , 1992, IEEE Trans. Neural Networks.

[4]  S Ullman,et al.  Shifts in selective visual attention: towards the underlying neural circuitry. , 1985, Human neurobiology.

[5]  Jyh-Shing Roger Jang,et al.  ANFIS: adaptive-network-based fuzzy inference system , 1993, IEEE Trans. Syst. Man Cybern..

[6]  Joo-Young Kim,et al.  The brain mimicking Visual Attention Engine: An 80×60 digital Cellular Neural Network for rapid global feature extraction , 2008, 2008 IEEE Symposium on VLSI Circuits.

[7]  A. R. Newton,et al.  Alpha-power law MOSFET model and its applications to CMOS inverter delay and other formulas , 1990 .

[8]  Donghyun Kim,et al.  A 125GOPS 583mW Network-on-Chip Based Parallel Processor with Bio-inspired Visual-Attention Engine , 2008, 2008 IEEE International Solid-State Circuits Conference - Digest of Technical Papers.

[9]  Hoi-Jun Yoo,et al.  Low-power NoC for high-performance SoC design , 2008 .

[10]  Hoi-Jun Yoo,et al.  A 201.4 GOPS 496 mW Real-Time Multi-Object Recognition Processor With Bio-Inspired Neural Perception Engine , 2009, IEEE Journal of Solid-State Circuits.

[11]  Hoi-Jun Yoo,et al.  A 1.2mW on-line learning mixed mode intelligent inference engine for robust object recognition , 2010, 2010 Symposium on VLSI Circuits.

[12]  Joo-Young Kim,et al.  A 125 GOPS 583 mW Network-on-Chip Based Parallel Processor With Bio-Inspired Visual Attention Engine , 2009, IEEE Journal of Solid-State Circuits.

[13]  Hoi-Jun Yoo,et al.  Familiarity based unified visual attention model for fast and robust object recognition , 2010, Pattern Recognit..

[14]  Donghyun Kim,et al.  An 81.6 GOPS Object Recognition Processor Based on NoC and Visual Image Processing Memory , 2007, 2007 IEEE Custom Integrated Circuits Conference.

[15]  Christof Koch,et al.  A Model of Saliency-Based Visual Attention for Rapid Scene Analysis , 2009 .

[16]  William J. Dally,et al.  A Programmable 512 GOPS Stream Processor for Signal, Image, and Video Processing , 2008, IEEE J. Solid State Circuits.

[17]  Joo-Young Kim,et al.  A 118.4GB/s multi-casting network-on-chip for real-time object recognition processor , 2009, 2009 Proceedings of ESSCIRC.