Interest Point Detection and Scale Selection in Space-Time

Several types of interest point detectors have been proposed for spatial images. This paper investigates how this notion can be generalised to the detection of interesting events in space-time data. Moreover, we develop a mechanism for spatio-temporal scale selection and detect events at scales corresponding to their extent in both space and time. To detect spatio-temporal events, we build on the idea of the Harris and Förstner interest point operators and detect regions in space-time where the image structures have significant local variations in both space and time. In this way, events that correspond to curved space-time structures are emphasised, while structures with locally constant motion are disregarded. To construct this operator, we start from a multi-scale windowed second moment matrix in space-time and combine its determinant and trace in a way similar to the spatial Harris operator. All space-time maxima of this operator are then adapted to characteristic scales by maximising a scale-normalised space-time Laplacian operator over both spatial and temporal scales. The motivation for performing temporal scale selection as a complement to previous approaches to spatial scale selection is to robustly capture spatio-temporal events of different temporal extent. It is shown that the resulting approach is truly scale invariant with respect to both spatial scales and temporal scales. The proposed concept is tested on synthetic and real image sequences. It is shown that the operator responds to distinct and stable points in space-time that often correspond to interesting events. The potential applications of the method are discussed.
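The construction of the operator can be illustrated with a minimal sketch at a single, fixed pair of scales. The abstract does not give the exact formula, so everything specific here is an assumption: the response is taken as det(mu) - k * trace(mu)^3, one natural 3D analogue of the spatial Harris measure, and the names `spacetime_harris`, `sigma2`, `tau2`, `s` and `k`, together with their default values, are hypothetical. The scale-selection step (maximising a scale-normalised Laplacian over spatial and temporal scales) is not implemented.

```python
import numpy as np


def gaussian_kernel(variance):
    """Sampled, normalised 1D Gaussian truncated at about three standard deviations."""
    radius = max(1, int(np.ceil(3.0 * np.sqrt(variance))))
    x = np.arange(-radius, radius + 1, dtype=float)
    kernel = np.exp(-(x ** 2) / (2.0 * variance))
    return kernel / kernel.sum()


def smooth(volume, variances):
    """Separable Gaussian smoothing of a (t, y, x) volume, one variance per axis."""
    out = np.asarray(volume, dtype=float)
    for axis, variance in enumerate(variances):
        kernel = gaussian_kernel(variance)
        radius = len(kernel) // 2

        def filter_1d(v):
            # Replicate border values so the output keeps the input length.
            return np.convolve(np.pad(v, radius, mode="edge"), kernel, mode="valid")

        out = np.apply_along_axis(filter_1d, axis, out)
    return out


def spacetime_harris(volume, sigma2=2.0, tau2=2.0, s=2.0, k=0.005):
    """Harris-like response on a (t, y, x) volume (assumed form, see lead-in).

    sigma2, tau2: spatial and temporal variances of the local (derivative) scale.
    s: factor relating the integration (window) scale to the local scale.
    k: weight of the trace term; the default is illustrative, not a sourced value.
    """
    # Scale-space representation at the local scales.
    L = smooth(volume, (tau2, sigma2, sigma2))
    # Central-difference derivatives along t, y, x (axes 0, 1, 2).
    Lt, Ly, Lx = np.gradient(L)
    grads = (Lx, Ly, Lt)

    # Windowed second moment matrix: Gaussian-weighted averages of gradient products.
    window = (s * tau2, s * sigma2, s * sigma2)
    mu = np.empty(L.shape + (3, 3))
    for i in range(3):
        for j in range(i, 3):
            mu[..., i, j] = mu[..., j, i] = smooth(grads[i] * grads[j], window)

    # Combine determinant and trace in the manner of the spatial Harris measure.
    return np.linalg.det(mu) - k * np.trace(mu, axis1=-2, axis2=-1) ** 3


# Synthetic test pattern: a brief flash, i.e. a blob localised in space and time.
t, y, x = np.mgrid[0:25, 0:25, 0:25]
blob = np.exp(-((t - 12) ** 2 + (y - 12) ** 2 + (x - 12) ** 2) / (2.0 * 4.0))
H = spacetime_harris(blob)
print(np.unravel_index(np.argmax(H), H.shape))
```

Local maxima of `H` over (t, y, x) would serve as candidate events at this one pair of scales. A full detector along the lines described above would repeat the computation over a range of `sigma2` and `tau2` values and keep the scales at which the normalised Laplacian peaks.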
