A versatile recognition processor employing Haar-like feature and cascaded classifier

This paper presents a versatile recognition processor that performs detection and recognition on image, video, sound and acceleration signals while dissipating 0.15µW/fps to 0.47mW/fps (Fig. 8.2.1). With sub-mW/fps power dissipation, the processor is suitable for portable electronics and wireless sensor networks (WSN) [1]. For instance, it detects human faces in a QVGA image with 81% accuracy while consuming 0.47mW/fps. This is 57× lower power than conventional object recognition processors [2, 3] of comparable accuracy (Fig. 8.2.2); a fair comparison that takes technology differences into account still shows more than 8× better power efficiency. The processor also detects speech from very short, low-quality sound signals (72ms in 10s, 8kHz, 8b) recorded by a microphone in a sensor node, and recognizes human activities such as walking, reading and typing from short, low-quality 3D acceleration signals (2s in 10s, 50Hz, 8b) taken by an accelerometer. Recognition accuracy is over 90% in both applications. The versatility and low power dissipation are attributed to VLSI design optimized from the algorithm level through the architecture and circuit levels.
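
For readers unfamiliar with the two techniques named in the title, the following is a minimal software sketch of a Haar-like feature evaluated on an integral image and fed into a cascaded classifier, in the general style of [8]. It is not the processor's implementation: the function names, the single two-rectangle feature shape, and the stage format (a list of weighted threshold tests plus a stage threshold) are all assumptions chosen for illustration, and the thresholds are made up rather than trained.

```python
from typing import List, Sequence, Tuple

# Hypothetical weak-classifier format: (x, y, w, h, feature_threshold, weight)
Weak = Tuple[int, int, int, int, float, float]
# Hypothetical stage format: (weak classifiers, stage threshold)
Stage = Tuple[List[Weak], float]


def integral_image(img: Sequence[Sequence[int]]) -> List[List[int]]:
    """Summed-area table with an extra zero row and column at the top/left."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii: List[List[int]], x: int, y: int, w: int, h: int) -> int:
    """Sum of the pixels in a rectangle using four table lookups."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii: List[List[int]], x: int, y: int, w: int, h: int) -> int:
    """Haar-like feature: left half of the window minus the right half."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)


def cascade_accepts(ii: List[List[int]], stages: Sequence[Stage]) -> bool:
    """Run the stages in order and reject as soon as one stage fails."""
    for weak_classifiers, stage_threshold in stages:
        score = 0.0
        for x, y, w, h, thr, weight in weak_classifiers:
            if two_rect_feature(ii, x, y, w, h) > thr:
                score += weight
        if score < stage_threshold:
            return False  # early exit: later stages are never evaluated
    return True


# Toy 4x4 window: bright left half, dark right half (illustrative values).
edge = [[9, 9, 1, 1] for _ in range(4)]
flat = [[5, 5, 5, 5] for _ in range(4)]
stages: List[Stage] = [([(0, 0, 4, 4, 10.0, 1.0)], 0.5)]

edge_accepted = cascade_accepts(integral_image(edge), stages)
flat_accepted = cascade_accepts(integral_image(flat), stages)
```

In this sketch each feature costs only a handful of additions and subtractions, and the cascade discards most windows after the first stage. Those two properties are a plausible reason why the approach maps well onto low-power hardware, though the abstract itself does not detail the mechanism.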

[1] Donghyun Kim, et al. A 125GOPS 583mW Network-on-Chip Based Parallel Processor with Bio-inspired Visual-Attention Engine, 2008, 2008 IEEE International Solid-State Circuits Conference - Digest of Technical Papers.

[2] Liang-Gee Chen, et al. iVisual: An intelligent visual sensor SoC with 2790fps CMOS image sensor and 205GOPS/W vision processor, 2008, 2008 45th ACM/IEEE Design Automation Conference.

[3] Tadahiro Kuroda, et al. Speech "Siglet" Detection for Business Microscope (concise contribution), 2008, 2008 Sixth Annual IEEE International Conference on Pervasive Computing and Communications (PerCom).

[4] T. Kuroda, et al. Low cost speech detection using Haar-like filtering for sensornet, 2008, 2008 9th International Conference on Signal Processing.

[5] Joo-Young Kim, et al. A 125 GOPS 583 mW Network-on-Chip Based Parallel Processor With Bio-Inspired Visual Attention Engine, 2009, IEEE Journal of Solid-State Circuits.

[6] Tadahiro Kuroda, et al. Haar-Like Filtering for Human Activity Recognition Using 3D Accelerometer, 2009, 2009 IEEE 13th Digital Signal Processing Workshop and 5th IEEE Signal Processing Education Workshop.

[7] Tadahiro Kuroda, et al. Speaker Siglet Detection for Business Microscope, 2008, 2008 Seventh International Conference on Machine Learning and Applications.

[8] Paul A. Viola, et al. Rapid object detection using a boosted cascade of simple features, 2001, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001.