Advances in View-Invariant Human Motion Analysis: A Review

As the viewpoint issue is becoming a bottleneck for human motion analysis and its applications, researchers have in recent years devoted themselves to view-invariant human motion analysis and achieved encouraging progress. The challenge is to find a methodology that can recognize human motion patterns in order to reach increasingly sophisticated levels of human behavior description. This paper provides a comprehensive survey of this research, with emphasis on view-invariant representation and on the recognition of poses and actions. To help readers understand the integrated process of visual analysis of human motion, the paper presents recent developments in three major issues involved in a general human motion analysis system: human detection, view-invariant pose representation and estimation, and behavior understanding. Publicly available standard datasets are recommended. The concluding discussion assesses the progress so far and outlines research challenges, future directions, and the solutions essential to achieving the goals of human motion analysis.
