Paper Information - Prediction of Human Activity by Discovering Temporal Sequence Patterns

Prediction of Human Activity by Discovering Temporal Sequence Patterns

Early prediction of ongoing human activity is increasingly valuable in a wide range of time-critical applications. To build an effective representation for prediction, human activities can be characterized by a complex temporal composition of constituent simple actions and interacting objects. In contrast to early detection of short-duration simple actions, we propose a novel framework for long-duration complex activity prediction that discovers three key aspects of an activity: causality, context cues, and predictability. The major contributions of our work are: (1) a general framework that systematically addresses complex activity prediction by mining temporal sequence patterns; (2) a probabilistic suffix tree (PST) that models causal relationships between constituent actions, capturing both high-order and low-order Markov dependencies between action units; (3) a model of context cues, especially information about interactive objects, based on sequential pattern mining (SPM), in which a series of action and object co-occurrences is encoded as a complex symbolic sequence; and (4) a predictive accumulative function (PAF) that depicts the predictability of each kind of activity. The effectiveness of our approach is evaluated in two experimental scenarios, action-only prediction and context-aware prediction, with two data sets for each. Our method achieves superior performance for predicting global activity classes and local action units.
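To make the idea behind contribution (2) concrete, the following is a minimal sketch of variable-order Markov prediction over action symbols. It is not the authors' PST implementation: it stores raw counts for every context up to a fixed length and backs off to the longest observed suffix, whereas a full probabilistic suffix tree also prunes contexts by frequency and by how much they change the next-symbol distribution, and smooths the probabilities. The class name `SuffixContextModel`, the `max_order` parameter, and the action labels are all hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


class SuffixContextModel:
    """Toy variable-order Markov model over action symbols (illustrative only)."""

    def __init__(self, max_order=3):
        # Longest context length kept (assumed hyperparameter, not from the paper).
        self.max_order = max_order
        # counts[context][symbol] = times `symbol` followed `context` in training.
        self.counts = defaultdict(Counter)

    def fit(self, sequences):
        for seq in sequences:
            for i, symbol in enumerate(seq):
                # Record every context of length 0..max_order ending just before i.
                for order in range(min(i, self.max_order) + 1):
                    context = tuple(seq[i - order:i])
                    self.counts[context][symbol] += 1
        return self

    def predict_next(self, history):
        # Back off from the longest observed suffix of the history to shorter ones.
        for order in range(min(len(history), self.max_order), -1, -1):
            context = tuple(history[len(history) - order:])
            if context in self.counts:
                following = self.counts[context]
                total = sum(following.values())
                return {s: c / total for s, c in following.items()}
        return {}  # Untrained model: nothing to predict from.


# Hypothetical action-unit sequences, one per observed activity instance.
training = [
    ["reach", "grasp", "pour", "drink"],
    ["reach", "grasp", "pour", "stir"],
    ["reach", "grasp", "lift", "place"],
]
model = SuffixContextModel(max_order=2).fit(training)

# Distribution over the next action unit given a partial observation.
prediction = model.predict_next(["reach", "grasp"])
# An unseen history falls back to the empty context (overall symbol frequencies).
fallback = model.predict_next(["unknown"])
```

Backing off to shorter suffixes is what lets a single model use long contexts where the data supports them and short contexts elsewhere, which is the property the abstract describes as capturing both high-order and low-order dependencies.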

Yun Fu | Kang Li
