Learning probability distributions over partially-ordered human everyday activities

We propose a method to learn the partially-ordered structure inherent in human everyday activities from observations by exploiting variability in the data. Using statistical relational learning, the system extracts a full-joint probability distribution over the actions that form a task, their (partial) ordering, and their properties. The relevant action properties and relations among actions are identified as those that are consistent across the observations. The learned models can be used to classify action sequences and to determine which actions are relevant for a task, which objects are usually manipulated, and which action properties are typical for a person. We evaluate the approach on synthetic data sampled from partial-order trees and on two real-world data sets of human activities: the TUM Kitchen Data Set and the CMU-MMAC data set. The results show that our approach outperforms sequence-based models such as Conditional Random Fields when classifying observations of activities that allow a large amount of variation.
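The idea of reading ordering constraints off variable observations can be illustrated with a small sketch. This is not the statistical relational model described above; it is a deterministic simplification that keeps a precedence relation only if no observation contradicts it. The function name `consistent_precedences`, the action labels, and the assumption that each action occurs at most once per observation are all hypothetical and chosen for illustration.

```python
def consistent_precedences(sequences):
    """Return pairs (a, b) where a precedes b in at least one observation
    and b never precedes a in any observation.

    Assumes (for this sketch only) that each action label occurs at most
    once per observed sequence.
    """
    seen = set()  # every ordered pair (earlier, later) observed anywhere
    for seq in sequences:
        for i, earlier in enumerate(seq):
            for later in seq[i + 1:]:
                seen.add((earlier, later))
    # A pair is kept only if its reverse was never observed.
    return {(a, b) for (a, b) in seen if (b, a) not in seen}


# Hypothetical observations of the same task performed in two different orders.
observations = [
    ["open_cupboard", "take_plate", "take_cup", "close_cupboard"],
    ["open_cupboard", "take_cup", "take_plate", "close_cupboard"],
]

constraints = consistent_precedences(observations)
for a, b in sorted(constraints):
    print(f"{a} -> {b}")
```

Because the two observations disagree on the order of `take_plate` and `take_cup`, neither ordering of that pair is kept, while the constraints involving opening and closing the cupboard survive. A probabilistic model would instead assign weights to such relations rather than accepting or rejecting them outright.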
