A 92mW 76.8GOPS vector matching processor with parallel Huffman decoder and query re-ordering buffer for real-time object recognition

A vector matching processor with memory bandwidth optimizations is proposed to achieve real-time matching of 128-dimensional SIFT features extracted from VGA video. The main bottleneck of feature-vector matching is off-chip database access. We employ the locality-sensitive hashing (LSH) algorithm, which reduces the number of database comparisons required to match each query. In addition, database compression using Huffman coding increases the effective external bandwidth, and dedicated parallel Huffman decoder hardware ensures fast decompression of the database. A flexible query re-ordering buffer exploits overlapping accesses between queries by enabling out-of-order query processing, which minimizes redundant off-chip accesses. As a result, the 76.8 GOPS feature matching processor, implemented in a 0.13 µm CMOS process, achieves 43,200 queries/second on a 100-object database while consuming a peak power of 92 mW.
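As a software illustration of how LSH and query re-ordering might interact, the following minimal sketch hashes 128-dimensional vectors into buckets and then processes queries grouped by bucket, so that each bucket is fetched only once. It is not the processor's implementation: the random-hyperplane hash family, the `NUM_BITS` hash width, and all function names are assumptions made for illustration, and the `fetches` counter merely stands in for off-chip accesses.

```python
import random
from collections import defaultdict

DIM = 128      # SIFT descriptor length, as stated in the abstract
NUM_BITS = 8   # hypothetical hash width, chosen only for this sketch


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw random hyperplanes (an assumed LSH family, not the paper's)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def lsh_key(vec, planes):
    """Hash a vector to an integer: one sign bit per hyperplane."""
    key = 0
    for plane in planes:
        dot = sum(p * v for p, v in zip(plane, vec))
        key = (key << 1) | (1 if dot >= 0 else 0)
    return key


def build_index(database, planes):
    """Map each bucket key to the database indices that hash to it."""
    buckets = defaultdict(list)
    for idx, vec in enumerate(database):
        buckets[lsh_key(vec, planes)].append(idx)
    return buckets


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def match_reordered(queries, database, buckets, planes):
    """Match queries out of order, grouped by bucket key.

    Returns the best database index per query (None if the bucket is
    empty) and the number of simulated bucket fetches.
    """
    by_key = defaultdict(list)
    for qi, q in enumerate(queries):
        by_key[lsh_key(q, planes)].append(qi)

    results = [None] * len(queries)
    fetches = 0
    for key, query_ids in by_key.items():
        candidates = buckets.get(key, [])  # one simulated off-chip fetch
        fetches += 1
        for qi in query_ids:
            if candidates:
                results[qi] = min(
                    candidates, key=lambda i: sq_dist(queries[qi], database[i])
                )
    return results, fetches
```

In this sketch, queries that share a bucket reuse a single fetch of the candidate list, which is the kind of overlap the re-ordering buffer is described as exploiting. Processing the same queries strictly in arrival order would instead cost one fetch per query.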
