Paper Information - Rate-Invariant Recognition of Humans and Their Activities

Rate-Invariant Recognition of Humans and Their Activities

Pattern recognition in video is a challenging task because of the multitude of spatio-temporal variations that occur in different videos capturing the exact same event. While traditional pattern-theoretic approaches account for the spatial changes that occur due to lighting and pose, very little has been done to address the effect of temporal rate changes in the executions of an event. In this paper, we provide a systematic model-based approach to learn the nature of such temporal variations (time warps) while simultaneously allowing for the spatial variations in the descriptors. We illustrate our approach for the problem of action recognition and provide experimental justification for the importance of accounting for rate variations in action recognition. The model is composed of a nominal activity trajectory and a function space capturing the probability distribution of activity-specific time warping transformations. We use the square-root parameterization of time warps to derive geodesics, distance measures, and probability distributions on the space of time warping functions. We then design a Bayesian algorithm which treats the execution rate function as a nuisance variable and integrates it out using Monte Carlo sampling, to generate estimates of class posteriors. This approach allows us to learn the space of time warps for each activity while simultaneously capturing other intra- and interclass variations. Next, we discuss a special case of this approach which assumes a uniform distribution on the space of time warping functions and show how computationally efficient inference algorithms may be derived for this special case. We discuss the relative advantages and disadvantages of both approaches and show their efficacy using experiments on gait-based person identification and activity recognition.
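To make the square-root parameterization mentioned in the abstract concrete, here is a minimal sketch under stated assumptions. It relies on a standard fact: if a time warp gamma maps [0, 1] onto [0, 1] with a non-negative derivative, then psi = sqrt(d gamma / dt) has unit L2 norm, so warps can be treated as points on a unit sphere where geodesics and distances have closed forms. The function names, the uniform sampling grid, and the trapezoid-rule integration are illustrative choices, not taken from the paper.

```python
import numpy as np

def inner_product(f, g, t):
    """Trapezoid-rule approximation of the L2 inner product on the grid t."""
    return float(np.sum(0.5 * (f[1:] * g[1:] + f[:-1] * g[:-1]) * np.diff(t)))

def unit_sqrt_representation(gamma, t):
    """Map a sampled warp gamma(t) to psi = sqrt(gamma'), rescaled to unit norm."""
    dgamma = np.gradient(gamma, t)               # finite-difference derivative
    psi = np.sqrt(np.clip(dgamma, 0.0, None))    # guard against tiny negative slopes
    return psi / np.sqrt(inner_product(psi, psi, t))

def sphere_angle(psi1, psi2, t):
    """Arc length between two unit-norm functions on the sphere."""
    return float(np.arccos(np.clip(inner_product(psi1, psi2, t), -1.0, 1.0)))

def warp_distance(gamma1, gamma2, t):
    """Geodesic distance between two warps under the square-root map."""
    return sphere_angle(unit_sqrt_representation(gamma1, t),
                        unit_sqrt_representation(gamma2, t), t)

def geodesic_point(psi1, psi2, t, s):
    """Point at fraction s in [0, 1] along the great-circle arc from psi1 to psi2."""
    theta = sphere_angle(psi1, psi2, t)
    if theta < 1e-12:                            # endpoints coincide numerically
        return psi1.copy()
    return (np.sin((1.0 - s) * theta) * psi1 + np.sin(s * theta) * psi2) / np.sin(theta)

def warp_from_sqrt(psi, t):
    """Recover gamma(t) as the running integral of psi squared, with gamma(0) = 0."""
    increments = 0.5 * (psi[1:] ** 2 + psi[:-1] ** 2) * np.diff(t)
    return np.concatenate(([0.0], np.cumsum(increments)))

if __name__ == "__main__":
    t = np.linspace(0.0, 1.0, 1001)
    identity, faster_late = t.copy(), t ** 2     # two hypothetical execution rates
    print("distance:", warp_distance(identity, faster_late, t))
    mid = geodesic_point(unit_sqrt_representation(identity, t),
                         unit_sqrt_representation(faster_late, t), t, 0.5)
    print("midpoint warp ends at:", warp_from_sqrt(mid, t)[-1])
```

Working on the sphere keeps every interpolated point a valid warp: squaring and integrating any point on the arc yields a non-decreasing function from 0 to 1, which plain linear averaging of square-root functions would not guarantee without renormalization.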
