Velocity adaptation of spatio-temporal receptive fields for direct recognition of activities: an experimental study

This article presents an experimental study of the influence of velocity adaptation when recognizing spatio-temporal patterns using a histogram-based statistical framework. The basic idea consists of adapting the shapes of the filter kernels to the local direction of motion, so as to allow the computation of image descriptors that are invariant to the relative motion in the image plane between the camera and the objects or events that are studied. Based on a framework of recursive spatio-temporal scale-space, we first outline how a straightforward mechanism for local velocity adaptation can be expressed. Then, for a test problem of recognizing activities, we present an experimental evaluation, which shows the advantages of using velocity-adapted spatio-temporal receptive fields, compared to directional derivatives or regular partial derivatives for which the filter kernels have not been adapted to the local image motion.
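As a minimal illustration of why adapting to the local image motion helps, the sketch below compares a regular temporal partial derivative with a velocity-adapted one on a synthetic 1-D pattern that translates at constant velocity. It is not the recursive spatio-temporal scale-space filters of the article: it uses plain finite differences, assumes the velocity is known in advance, and the function names `moving_pattern` and `temporal_derivatives` are hypothetical. For a pattern f(x, t) = g(x - v t), the regular derivative f_t responds to the motion itself, whereas the velocity-adapted derivative f_t + v f_x vanishes up to discretization error.

```python
import numpy as np


def moving_pattern(velocity, n_x=200, n_t=50, sigma=8.0, start=60.0):
    """Sample f(x, t) = g(x - velocity * t) for a Gaussian blob g.

    Returns an array of shape (n_t, n_x): rows index time, columns index space.
    """
    x = np.arange(n_x, dtype=float)
    t = np.arange(n_t, dtype=float)
    X, T = np.meshgrid(x, t)
    return np.exp(-((X - start - velocity * T) ** 2) / (2.0 * sigma ** 2))


def temporal_derivatives(f, velocity):
    """Return (regular, velocity-adapted) temporal derivatives of f.

    The adapted derivative is f_t + velocity * f_x, i.e. the temporal
    derivative taken in a frame that moves along with the pattern.
    The velocity is assumed known here; estimating it is a separate problem.
    """
    f_t, f_x = np.gradient(f)  # axis 0 is time, axis 1 is space
    return f_t, f_t + velocity * f_x


velocity = 2.0  # pixels per frame (illustrative value)
f = moving_pattern(velocity)
regular, adapted = temporal_derivatives(f, velocity)

# Skip the first and last frames, where np.gradient falls back to
# less accurate one-sided differences.
print("max |f_t|         :", np.abs(regular[1:-1]).max())
print("max |f_t + v f_x| :", np.abs(adapted[1:-1]).max())
```

On the interior frames the adapted response is much smaller than the regular one, which is the invariance to relative image-plane motion that the abstract refers to, shown here in its simplest form.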
