Video-Based Human Behavior Understanding: A Survey

Understanding human behaviors is a challenging problem in computer vision that has recently seen important advances. Human behavior understanding combines image and signal processing, feature extraction, machine learning, and 3-D geometry. Application scenarios range from surveillance to indexing and retrieval, from patient care to industrial safety and sports analysis. Given the broad set of techniques used in video-based behavior understanding and the fast progress in this area, in this paper we organize and survey the corresponding literature, define unambiguous key terms, and discuss links among fundamental building blocks ranging from human detection to action and interaction recognition. We critically discuss the advantages and drawbacks of the methods, providing comprehensive coverage of key aspects of video-based human behavior understanding, available datasets for experimentation and comparison, and important open research issues.
