Adaptive Image Sensor Sampling for Limited Memory Motion Detection

In this paper we propose that combining a state-of-the-art, high-frequency, low-energy microprocessor architecture with a highly programmable image sensor can substantially reduce the cost and energy required for low-level visual event detection and object tracking. The XMOS microprocessor is a single- or multi-core concurrent architecture that runs at between 400 and 1600 MIPS, with 64KB of on-chip RAM per core. Modern highly programmable image sensors such as the Kodak KAC-401 can capture regions-of-interest (ROIs) at rates in excess of 1500fps. Comparing two 320 by 240 pixel images would usually require 150KB of RAM (two frames of 76,800 pixels each, assuming one byte per pixel), which exceeds the 64KB available to a single core; combining the above components as a computational camera overcomes this constraint. In the proposed system the microprocessor programs the sensor to capture each image as a sequence of high-frame-rate ROIs. Motion is then detected by differencing successive captures of each ROI over time, so only ROI-sized buffers need to be held in memory. With additional cores, more extensive image processing can be carried out, and ROI pixels can be composited onto an LCD to give output images of 320 by 240 pixels at near-standard frame rates.
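The memory argument and the ROI differencing idea can be illustrated with a minimal sketch. It is not the system's implementation: the 80 by 60 ROI size, the thresholds, the one-byte-per-pixel assumption, and every function name (including `capture_roi`, which stands in for a sensor ROI read) are hypothetical choices made for illustration only.

```python
# Sketch: tile-by-tile motion detection with ROI-sized buffers.
# All names and parameter values below are illustrative assumptions.

WIDTH, HEIGHT = 320, 240   # full image size quoted in the text
BYTES_PER_PIXEL = 1        # assumption: 8-bit greyscale pixels


def full_frame_diff_bytes():
    """Bytes needed to hold two complete frames for differencing."""
    return 2 * WIDTH * HEIGHT * BYTES_PER_PIXEL


def roi_diff_bytes(roi_w, roi_h):
    """Bytes needed to hold two captures of a single ROI."""
    return 2 * roi_w * roi_h * BYTES_PER_PIXEL


def roi_origins(roi_w, roi_h):
    """Yield the top-left corner of each ROI tile covering the image."""
    for y in range(0, HEIGHT, roi_h):
        for x in range(0, WIDTH, roi_w):
            yield x, y


def capture_roi(scene, x, y, roi_w, roi_h):
    """Hypothetical stand-in for a sensor ROI read; scene is a list of rows."""
    return [row[x:x + roi_w] for row in scene[y:y + roi_h]]


def roi_has_motion(prev, curr, pixel_thresh=16, count_thresh=8):
    """Flag motion when enough pixels change by more than pixel_thresh."""
    changed = sum(
        1
        for prev_row, curr_row in zip(prev, curr)
        for p, c in zip(prev_row, curr_row)
        if abs(p - c) > pixel_thresh
    )
    return changed >= count_thresh


def detect_motion(scene_t0, scene_t1, roi_w=80, roi_h=60):
    """Return origins of ROIs that differ between two points in time.

    Only two ROI-sized buffers are alive at once, never a full frame.
    """
    moving = []
    for x, y in roi_origins(roi_w, roi_h):
        prev = capture_roi(scene_t0, x, y, roi_w, roi_h)
        curr = capture_roi(scene_t1, x, y, roi_w, roi_h)
        if roi_has_motion(prev, curr):
            moving.append((x, y))
    return moving


if __name__ == "__main__":
    blank = [[0] * WIDTH for _ in range(HEIGHT)]
    moved = [row[:] for row in blank]
    for y in range(100, 110):          # paint a 10x10 bright patch
        for x in range(200, 210):
            moved[y][x] = 255
    print(full_frame_diff_bytes())     # 153600 bytes, i.e. 150KB
    print(roi_diff_bytes(80, 60))      # 9600 bytes
    print(detect_motion(blank, moved)) # [(160, 60)]
```

Under these assumptions a pair of 80 by 60 ROI buffers needs 9,600 bytes, comfortably within a 64KB core, whereas a full-frame pair needs 153,600 bytes. On real hardware the two captures of each ROI would be taken back to back from the sensor rather than sliced from stored frames as they are here.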
