Learning pixel-wise signal energy for understanding semantics

Abstract: Visual interpretation of events requires both an appropriate representation of the change occurring in a scene and the application of semantics to differentiate between types of change. Conventional approaches to tracking objects and modelling object dynamics rely on either temporal region correlation or pre-learnt shape or appearance models. We propose a new pixel-level approach for learning the temporal characteristics of change at individual pixels. Gaussian mixture models are used to model slow, long-term changes in pixel distributions, while pixel energy histories are used to extract fast-change signatures from short-term events, which are then modelled by CONDENSATION matching.
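
To make the two time scales concrete, the following is a minimal sketch and not the authors' implementation. It assumes a scalar intensity per pixel, an online mixture update in the general style of adaptive background mixture models, and a stand-in energy measure (a sliding-window mean of squared frame-to-frame differences), since the abstract does not define the energy computation. The names `PixelMixture` and `energy_history` and every parameter value are hypothetical, and the CONDENSATION matching stage is not shown.

```python
import math


class PixelMixture:
    """Online mixture of 1-D Gaussians for one pixel's intensity (illustrative only)."""

    def __init__(self, k=3, alpha=0.05, init_var=100.0, min_var=4.0, match_sigmas=2.5):
        self.k = k                        # maximum number of components (assumed)
        self.alpha = alpha                # learning rate (assumed)
        self.init_var = init_var          # variance given to a new component (assumed)
        self.min_var = min_var            # variance floor to avoid collapse (assumed)
        self.match_sigmas = match_sigmas  # match threshold in standard deviations
        self.weights, self.means, self.vars = [], [], []

    def update(self, x):
        """Fold intensity x into the mixture.

        Returns True if an existing component explained x (slow, long-term
        behaviour), and False if x was novel (a candidate fast change).
        """
        matched = None
        # Take the first component whose mean lies within the threshold.
        for i, (m, v) in enumerate(zip(self.means, self.vars)):
            if abs(x - m) <= self.match_sigmas * math.sqrt(v):
                matched = i
                break

        if matched is None:
            if len(self.weights) < self.k:
                # Room left: start a new low-weight component at x.
                self.weights.append(self.alpha)
                self.means.append(float(x))
                self.vars.append(self.init_var)
            else:
                # Mixture full: overwrite the weakest component.
                j = min(range(self.k), key=self.weights.__getitem__)
                self.weights[j] = self.alpha
                self.means[j] = float(x)
                self.vars[j] = self.init_var
        else:
            # Move weights toward the matched component.
            for i in range(len(self.weights)):
                target = 1.0 if i == matched else 0.0
                self.weights[i] += self.alpha * (target - self.weights[i])
            # Drift the matched mean and variance toward the observation.
            d = x - self.means[matched]
            self.means[matched] += self.alpha * d
            self.vars[matched] += self.alpha * (d * d - self.vars[matched])
            self.vars[matched] = max(self.vars[matched], self.min_var)

        total = sum(self.weights)
        self.weights = [w / total for w in self.weights]
        return matched is not None


def energy_history(samples, window=5):
    """Sliding-window mean of squared frame-to-frame differences.

    A stand-in for the pixel energy measure, which the abstract does not define.
    """
    diffs = [(b - a) ** 2 for a, b in zip(samples, samples[1:])]
    history = []
    for t in range(len(diffs)):
        recent = diffs[max(0, t - window + 1): t + 1]
        history.append(sum(recent) / len(recent))
    return history
```

Under these assumptions, a pixel that holds a steady value is absorbed into one mixture component, while a brief excursion is rejected by the mixture and shows up as a short burst in the energy history. It is the shape of such a burst over time that a trajectory matcher would then compare against learnt signatures.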
