Trajectory-Based Representation of Human Actions

This work addresses the problem of human action recognition by introducing a representation of a human action as a collection of short trajectories that are extracted in areas of the scene with significant amount of visual activity. The trajectories are extracted by an auxiliary particle filtering tracking scheme that is initialized at points that are considered salient both in space and time. The spatiotemporal salient points are detected by measuring the variations in the information content of pixel neighborhoods in space and time. We implement an online background estimation algorithm in order to deal with inadequate localization of the salient points on the moving parts in the scene, and to improve the overall performance of the particle filter tracking scheme.We use a variant of the Longest Common Subsequence algorithm (LCSS) in order to compare different sets of trajectories corresponding to different actions. We use Relevance Vector Machines (RVM) in order to address the classification problem. We propose new kernels for use by the RVM, which are specifically tailored to the proposed representation of short trajectories. The basis of these kernels is the modified LCSS distance of the previous step. We present results on real image sequences from a small database depicting people performing 12 aerobic exercises.

[1]  Arnold W. M. Smeulders,et al.  Robust Tracking Using Foreground-Background Texture Discrimination , 2006, International Journal of Computer Vision.

[2]  Alex Pentland,et al.  Human computing and machine understanding of human behavior: a survey , 2006, ICMI '06.

[3]  Shai Avidan,et al.  Support vector tracking , 2004, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[4]  Larry S. Davis,et al.  Fast multiple object tracking via a hierarchical particle filter , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[5]  Yen-Wei Chen,et al.  Articulated hand tracking by PCA-ICA approach , 2006, 7th International Conference on Automatic Face and Gesture Recognition (FGR06).

[6]  Linda G. Shapiro,et al.  Computer and Robot Vision , 1991 .

[7]  Sameer Singh,et al.  Video analysis of human dynamics - a survey , 2003, Real Time Imaging.

[8]  Björn Stenger,et al.  Model-based hand tracking using a hierarchical Bayesian filter , 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[9]  Ronen Basri,et al.  Actions as Space-Time Shapes , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[10]  Larry S. Davis,et al.  3-D model-based tracking of humans in action: a multi-view approach , 1996, Proceedings CVPR IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[11]  I. Patras,et al.  Spatiotemporal salient points for visual recognition of human actions , 2005, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).

[12]  Qiang Ji,et al.  Information extraction from image sequences of real-world facial expressions , 2005, Machine Vision and Applications.

[13]  Mubarak Shah,et al.  View-Invariant Representation and Recognition of Actions , 2002, International Journal of Computer Vision.

[14]  Tieniu Tan,et al.  Recent developments in human motion analysis , 2003, Pattern Recognit..

[15]  Gang Hua,et al.  Tracking articulated body by dynamic Markov network , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[16]  Y. Bar-Shalom Tracking and data association , 1988 .

[17]  Michael Isard,et al.  CONDENSATION—Conditional Density Propagation for Visual Tracking , 1998, International Journal of Computer Vision.

[18]  Yuanxin Wu,et al.  Unscented Kalman filtering for additive noise case: augmented versus nonaugmented , 2005, IEEE Signal Processing Letters.

[19]  Qiang Ji,et al.  Spatio-Temporal Context for Robust Multitarget Tracking , 2007 .

[20]  Yang Song,et al.  Unsupervised Learning of Human Motion , 2003, IEEE Trans. Pattern Anal. Mach. Intell..

[21]  Pascal Fua,et al.  Robust People Tracking with Global Trajectory Optimization , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[22]  Alex Pentland,et al.  Human Computing and Machine Understanding of Human Behavior: A Survey , 2007, Artifical Intelligence for Human Computing.

[23]  Thomas B. Moeslund,et al.  A brief overview of hand gestures used in wearable human computer interfaces , 2003 .

[24]  Michael Isard,et al.  Tracking loose-limbed people , 2004, CVPR 2004.

[25]  Larry S. Davis,et al.  Exemplar-based tracking and recognition of arm gestures , 2003, 3rd International Symposium on Image and Signal Processing and Analysis, 2003. ISPA 2003. Proceedings of the.

[26]  Rainer Stiefelhagen,et al.  Head pose estimation using stereo vision for human-robot interaction , 2004, Sixth IEEE International Conference on Automatic Face and Gesture Recognition, 2004. Proceedings..

[27]  Pietro Perona,et al.  Human action recognition by sequence of movelet codewords , 2002, Proceedings. First International Symposium on 3D Data Processing Visualization and Transmission.

[28]  Michael E. Tipping The Relevance Vector Machine , 1999, NIPS.

[29]  W. Eric L. Grimson,et al.  Adaptive background mixture models for real-time tracking , 1999, Proceedings. 1999 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Cat. No PR00149).

[30]  Maja Pantic,et al.  Tracking deformable motion , 2005, 2005 IEEE International Conference on Systems, Man and Cybernetics.

[31]  Linda G. Shapiro,et al.  Computer and Robot Vision (Volume II) , 2002 .

[32]  Pietro Perona,et al.  Hybrid models for human motion recognition , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[33]  Maja Pantic,et al.  Dynamics of facial expression: recognition of facial actions and their temporal segments from face profile image sequences , 2006, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).

[34]  Ian Reid,et al.  Probabilistic tracking and recognition of nonrigid hand motion , 2003, 2003 IEEE International SOI Conference. Proceedings (Cat. No.03CH37443).

[35]  Yueting Zhuang,et al.  A two-step approach to multiple facial feature tracking: temporal particle filter and spatial belief propagation , 2004, Sixth IEEE International Conference on Automatic Face and Gesture Recognition, 2004. Proceedings..

[36]  Jake K. Aggarwal,et al.  Human Motion Analysis: A Review , 1999, Comput. Vis. Image Underst..

[37]  Oswald Lanz,et al.  Approximate Bayesian multibody tracking , 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[38]  Jake K. Aggarwal,et al.  Human motion analysis: a review , 1997, Proceedings IEEE Nonrigid and Articulated Motion Workshop.

[39]  Tieniu Tan,et al.  Real time hand tracking by combining particle filtering and mean shift , 2004, Sixth IEEE International Conference on Automatic Face and Gesture Recognition, 2004. Proceedings..

[40]  Qiang Ji,et al.  Active and dynamic information fusion for facial expression understanding from image sequences , 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[41]  Michael Isard,et al.  Partitioned Sampling, Articulated Objects, and Interface-Quality Hand Tracking , 2000, ECCV.

[42]  M. Pitt,et al.  Filtering via Simulation: Auxiliary Particle Filters , 1999 .

[43]  Lihi Zelnik-Manor,et al.  Event-based analysis of video , 2001, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001.

[44]  Shai Avidan Ensemble Tracking , 2007, IEEE Trans. Pattern Anal. Mach. Intell..

[45]  Andrew Blake,et al.  Tracking through singularities and discontinuities by random sampling , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[46]  Ramakant Nevatia,et al.  Tracking multiple humans in crowded environment , 2004, CVPR 2004.

[47]  Yi-Ping Hung,et al.  Appearance-guided particle filtering for articulated hand tracking , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[48]  Jr. J.J. LaViola,et al.  A comparison of unscented and extended Kalman filtering for estimating quaternion motion , 2003, Proceedings of the 2003 American Control Conference, 2003..

[49]  Ricardo M. L. Barros,et al.  Tracking markers for human motion analysis , 1998, 9th European Signal Processing Conference (EUSIPCO 1998).

[50]  Tony X. Han,et al.  Efficient Nonparametric Belief Propagation with Application to Articulated Body Tracking , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[51]  Marcel J. T. Reinders,et al.  Influence of the observation likelihood function on particle filtering performance in tracking applications , 2004, Sixth IEEE International Conference on Automatic Face and Gesture Recognition, 2004. Proceedings..

[52]  Tanveer F. Syeda-Mahmood,et al.  View-invariant alignment and matching of video sequences , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[53]  Rainer Stiefelhagen,et al.  3D-tracking of head and hands for pointing gesture recognition in a human-robot interaction scenario , 2004, Sixth IEEE International Conference on Automatic Face and Gesture Recognition, 2004. Proceedings..

[54]  Maja Pantic,et al.  Particle filtering with factorized likelihoods for tracking facial features , 2004, Sixth IEEE International Conference on Automatic Face and Gesture Recognition, 2004. Proceedings..

[55]  Dorin Comaniciu,et al.  Real-time tracking of non-rigid objects using mean shift , 2000, Proceedings IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2000 (Cat. No.PR00662).

[56]  Andrew Blake,et al.  Articulated body motion capture by annealed particle filtering , 2000, Proceedings IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2000 (Cat. No.PR00662).

[57]  Dorin Comaniciu,et al.  Kernel-Based Object Tracking , 2003, IEEE Trans. Pattern Anal. Mach. Intell..

[58]  Jannik Fritsch,et al.  Kernel particle filter for real-time 3D body tracking in monocular color images , 2006, 7th International Conference on Automatic Face and Gesture Recognition (FGR06).

[59]  Michael Isard,et al.  ICONDENSATION: Unifying Low-Level and High-Level Tracking in a Stochastic Framework , 1998, ECCV.

[60]  David J. Fleet,et al.  Robust Online Appearance Models for Visual Tracking , 2003, IEEE Trans. Pattern Anal. Mach. Intell..

[61]  James J. Little,et al.  A Boosted Particle Filter: Multitarget Detection and Tracking , 2004, ECCV.

[62]  Michael J. Black,et al.  EigenTracking: Robust Matching and Tracking of Articulated Objects Using a View-Based Representation , 1996, International Journal of Computer Vision.

[63]  Dimitrios Gunopulos,et al.  Discovering similar multidimensional trajectories , 2002, Proceedings 18th International Conference on Data Engineering.

[64]  Takahiro Ishikawa,et al.  The template update problem , 2004, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[65]  Dariu Gavrila,et al.  The Visual Analysis of Human Movement: A Survey , 1999, Comput. Vis. Image Underst..

[66]  Jeffrey K. Uhlmann,et al.  Unscented filtering and nonlinear estimation , 2004, Proceedings of the IEEE.

[67]  George Kollios,et al.  Extraction and clustering of motion trajectories in video , 2004, Proceedings of the 17th International Conference on Pattern Recognition, 2004. ICPR 2004..

[68]  Rashid Ansari,et al.  Multiple object tracking with kernel particle filter , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[69]  M. Brady,et al.  Scale Saliency: a novel approach to salient feature and scale selection , 2003 .