3D trajectory recovery for tracking multiple objects and trajectory guided recognition of actions

A mechanism is proposed that integrates low-level (image processing), mid-level (recursive 3D trajectory estimation), and high-level (action recognition) processes. It is assumed that the system observes multiple moving objects via a single, uncalibrated video camera. A novel extended Kalman filter formulation is used to estimate the relative 3D motion trajectories up to a scale factor. The recursive estimation process provides a prediction and an error measure that are exploited in the higher-level stages of action recognition. Conversely, higher-level mechanisms provide feedback that allows the system to reliably segment moving objects and maintain their tracking before, during, and after occlusion. The 3D trajectory, occlusion, and segmentation information is used to extract stabilized views of each moving object. Trajectory-guided recognition (TGR) is proposed as a new and efficient method for adaptive classification of actions. The TGR approach is demonstrated using "motion history images" that are then recognized via a mixture-of-Gaussians classifier. The system was tested on recognizing various dynamic human outdoor activities, e.g., running, walking, roller blading, and cycling. Experiments with synthetic data sets are used to evaluate the stability of the trajectory estimator with respect to noise.
