Explaining optical flow events with parameterized spatio-temporal models

A spatio-temporal representation for complex optical flow events is developed that generalizes traditional parameterized motion models (e.g., affine). These generative spatio-temporal models may be non-linear or stochastic and are event-specific in that they characterize a particular type of object motion (e.g., sitting or walking). Within a Bayesian framework we seek the model, phase, rate, spatial position, and scale that best account for the image variation. The posterior distribution over this parameter space, conditioned on image measurements, is typically non-Gaussian. The distribution is represented using factored sampling and is predicted and updated over time using the Condensation algorithm. The resulting framework automatically detects, localizes, and recognizes motion events.
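
The select, predict, and measure cycle behind factored sampling and Condensation can be illustrated with a minimal sketch. This is not the authors' implementation: the two-dimensional state (phase, rate), the Gaussian diffusion terms, the noise-free phase observation, and the names `condensation_step` and `run_demo` are all assumptions chosen for illustration. The paper's state also includes model type, spatial position, and scale, and its likelihood is computed from image measurements.

```python
import math
import random


def condensation_step(particles, weights, predict, likelihood, rng):
    """One Condensation iteration over a weighted sample set."""
    n = len(particles)
    # Select: factored sampling, draw n samples with probability
    # proportional to their current weights.
    selected = rng.choices(particles, weights=weights, k=n)
    # Predict: push each sample through the (stochastic) temporal model.
    predicted = [predict(s, rng) for s in selected]
    # Measure: re-weight by the observation likelihood and normalize.
    raw = [likelihood(s) for s in predicted]
    total = sum(raw)
    if total <= 0.0:
        # Degenerate case: fall back to uniform weights.
        new_weights = [1.0 / n] * n
    else:
        new_weights = [w / total for w in raw]
    return predicted, new_weights


def run_demo(seed=0, n=500, steps=20):
    """Track the phase of a hypothetical periodic motion model."""
    rng = random.Random(seed)
    true_phase, true_rate = 0.3, 0.05  # assumed ground truth
    # Prior: phase uniform in [0, 1), rate uniform in [0, 0.1].
    particles = [(rng.random(), rng.uniform(0.0, 0.1)) for _ in range(n)]
    weights = [1.0 / n] * n

    def predict(state, rng):
        phase, rate = state
        rate = rate + rng.gauss(0.0, 0.002)              # assumed rate diffusion
        phase = (phase + rate + rng.gauss(0.0, 0.01)) % 1.0
        return (phase, rate)

    for _ in range(steps):
        true_phase = (true_phase + true_rate) % 1.0
        observed = true_phase  # assumption: phase observed without noise

        def likelihood(state, z=observed):
            d = abs(state[0] - z)
            d = min(d, 1.0 - d)  # circular distance between phases
            return math.exp(-0.5 * (d / 0.05) ** 2)

        particles, weights = condensation_step(
            particles, weights, predict, likelihood, rng
        )
    return particles, weights, true_phase
```

Because the posterior is carried as a weighted sample set rather than a mean and covariance, the same loop handles multi-modal, non-Gaussian distributions; an estimate of any parameter can be read off as a weighted statistic of the returned samples.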
