Learned Models for Estimation of Rigid and Articulated Human Motion from Stationary or Moving Camera

We propose an approach for modeling, measuring, and tracking rigid and articulated motion as viewed from a stationary or moving camera. We first describe a method for learning temporal-flow models from exemplar image sequences. Each model is represented as a set of orthogonal temporal-flow bases learned by principal component analysis of instantaneous flow measurements. Spatial constraints on the temporal flow are then incorporated to model the movement of regions of rigid or articulated objects. The resulting spatio-temporal flow models serve as the basis for simultaneous measurement and tracking of brightness motion in image sequences. We then address the problem of estimating composite independent object and camera image motions: spatio-temporal flow models learned by observing typical movements of the object from a stationary camera are used to decompose image motion into independent object and camera motions. The performance of the algorithms is demonstrated on several long image sequences of rigid and articulated bodies in motion.
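As a rough illustration of the basis-learning step, the following sketch applies principal component analysis to a set of exemplar flow vectors and then expresses a new measurement in the learned basis. It is a minimal example under stated assumptions, not the implementation used in this work: the exemplars are synthetic, each row is assumed to be one flattened temporal-flow exemplar, and the function names (`learn_flow_bases`, `project`, `reconstruct`) are hypothetical.

```python
import numpy as np

def learn_flow_bases(exemplars, num_bases):
    """Learn orthonormal bases by PCA.

    exemplars: (n_samples, n_features) array; each row is assumed to be
    one flattened temporal-flow exemplar (an assumption of this sketch).
    """
    mean = exemplars.mean(axis=0)
    centered = exemplars - mean
    # SVD of the centered data: the rows of vt are orthonormal principal directions.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:num_bases], singular_values[:num_bases]

def project(flow, mean, bases):
    """Coefficients of a flow vector in the learned basis."""
    return bases @ (flow - mean)

def reconstruct(coeffs, mean, bases):
    """Flow vector approximated from its basis coefficients."""
    return mean + coeffs @ bases

# Synthetic exemplars: random mixtures of two temporal patterns plus small noise.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 2.0 * np.pi, 20)
patterns = np.stack([np.sin(t), np.cos(t)])            # shape (2, 20)
weights = rng.normal(size=(50, 2))
exemplars = weights @ patterns + 0.01 * rng.normal(size=(50, 20))

mean, bases, singular_values = learn_flow_bases(exemplars, num_bases=2)

# A new measurement drawn from the same family of motions.
new_flow = 0.5 * patterns[0] - 1.2 * patterns[1]
coeffs = project(new_flow, mean, bases)
approx = reconstruct(coeffs, mean, bases)
print("reconstruction error:", np.linalg.norm(approx - new_flow))
```

Because the bases are orthonormal, projection reduces to a single matrix-vector product, and a new motion from the same family is summarized by a small number of coefficients.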
