A 52mW full HD 160-degree object viewpoint recognition SoC with visual vocabulary processor for wearable vision applications

A wearable 1920×1080, 160-degree object viewpoint recognition SoC is realized on a 6.38 mm² die in 65 nm CMOS technology. The system focuses on enhancing wide-viewpoint and long-distance recognition while reducing the computation of the feature matching process. For a traffic light 50 m away, recognition accuracy improves from 29% at VGA resolution (640×480) to 94% at full HD resolution. Object viewpoint prediction (OVP) supports 160-degree object viewpoint differences. The proposed visual vocabulary processor (VVP) reduces power consumption by 85% and memory bandwidth by 75%. The SoC achieves 52 mW power consumption with an area efficiency of 25.9 GOPS/mm².
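The headline figures above can be related to each other with simple arithmetic. The sketch below is illustrative only: it assumes that the area efficiency is defined over the full 6.38 mm² die (it may instead refer to core area) and that the 52 mW figure applies at peak throughput. The derived throughput and GOPS/W values are therefore estimates, not numbers reported in the abstract, and all variable names are hypothetical.

```python
# Figures quoted in the abstract.
DIE_AREA_MM2 = 6.38
AREA_EFFICIENCY_GOPS_PER_MM2 = 25.9
POWER_W = 0.052  # 52 mW

# Pixel counts of the two resolutions compared in the abstract.
full_hd_pixels = 1920 * 1080
vga_pixels = 640 * 480
pixel_ratio = full_hd_pixels / vga_pixels

# ASSUMPTION: area efficiency = throughput / total die area.
throughput_gops = AREA_EFFICIENCY_GOPS_PER_MM2 * DIE_AREA_MM2

# ASSUMPTION: the 52 mW figure is measured at that throughput.
gops_per_watt = throughput_gops / POWER_W

print(f"Full HD carries {pixel_ratio:.2f}x the pixels of VGA")
print(f"Implied throughput: {throughput_gops:.1f} GOPS")
print(f"Implied energy efficiency: {gops_per_watt:.0f} GOPS/W")
```

Under these assumptions, a full HD frame holds 6.75 times as many pixels as a VGA frame, which gives a rough sense of the extra data the feature matching stage must handle to reach the long-distance accuracy quoted above.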
