Invariant Human Pose Feature Extraction for Movement Recognition and Pose Estimation

Reliable extraction of human pose features that are invariant to changes in view angle and body shape is critical for advancing human movement analysis. In this dissertation, multifactor analysis techniques, including multilinear analysis and multifactor Gaussian process methods, have been used to extract such invariant pose features from video data by decomposing the key factors that contribute to the image observations, such as pose, view angle, and body shape. Experimental results show that the pose features extracted using the proposed methods exhibit excellent invariance to changes in view angle and body shape. Furthermore, using the proposed invariant multifactor pose features, a suite of simple yet effective algorithms has been developed to solve the movement recognition and pose estimation problems. These algorithms yield excellent human movement analysis results, most of which are superior to those obtained from state-of-the-art algorithms on the same testing datasets. A number of key movement analysis challenges, including robust online gesture spotting and multi-camera gesture recognition, have also been addressed in this research. To this end, an online gesture spotting framework has been developed that automatically detects and learns non-gesture movement patterns to improve gesture localization and recognition from continuous data streams using a hidden Markov network. In addition, data fusion schemes have been investigated for multi-camera gesture recognition, and the decision-level camera fusion scheme using the product rule has been found to be optimal for gesture recognition using multiple uncalibrated cameras. Finally, the challenge of optimal camera selection in multi-camera gesture recognition has been tackled by proposing a measure that quantifies the complementary strength across cameras. Experimental results obtained from a real-life gesture recognition dataset show that the optimal camera combinations identified according to the proposed complementary measure always lead to the best gesture recognition results.
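
As an illustration of the decision-level fusion mentioned above, the following is a minimal sketch of the product rule for combining per-camera classifier outputs. It is not the dissertation's implementation: the function name `product_rule_fusion`, the input layout (one row of class posteriors per camera), the equal-prior assumption, and the small `eps` constant are all assumptions made for this example.

```python
import numpy as np

def product_rule_fusion(posteriors, eps=1e-12):
    """Fuse per-camera class posteriors with the product rule.

    posteriors: array-like of shape (n_cameras, n_classes); each row is one
    camera's posterior over gesture classes (hypothetical input layout).
    Equal class priors are assumed, so the fused score of a class is simply
    the product of the per-camera posteriors for that class.
    """
    p = np.asarray(posteriors, dtype=float)
    # Summing logs is equivalent to multiplying probabilities but avoids
    # numerical underflow; eps guards against log(0).
    log_joint = np.sum(np.log(p + eps), axis=0)
    # Shift by the maximum before exponentiating for numerical stability.
    log_joint -= log_joint.max()
    fused = np.exp(log_joint)
    fused /= fused.sum()  # renormalize to a probability distribution
    return int(np.argmax(fused)), fused

# Three cameras, three gesture classes (made-up numbers for illustration).
label, fused = product_rule_fusion([
    [0.6, 0.3, 0.1],
    [0.5, 0.4, 0.1],
    [0.2, 0.7, 0.1],
])
print(label, fused)
```

Because the rule operates only on each camera's classification output, it needs no geometric relationship between the views, which is consistent with its use on uncalibrated cameras.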
