Directed Acyclic Graph Kernels for Action Recognition

One trend in action recognition consists in extracting and comparing mid-level features that encode the visual and motion aspects of objects in scenes. However, when scenes contain high-level semantic actions with many interacting parts, these mid-level features are not sufficient to capture high-level structures or high-order causal relationships between moving objects, resulting in a clear drop in performance. In this paper, we address this issue and propose an alternative action recognition method based on a novel graph kernel. As the main contributions of this work, we first describe actions in videos using directed acyclic graphs (DAGs), which naturally encode pairwise interactions between moving object parts, and then we compare these DAGs by analyzing the spectrum of their sub-patterns, which capture complex higher-order interactions. This extraction and comparison process is computationally tractable, owing to the acyclic property of DAGs, and it also defines a positive semi-definite kernel. When plugging the latter into support vector machines, we obtain an action recognition algorithm that outperforms related work, including graph-based methods, on a standard evaluation dataset.
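
To make the idea of comparing DAGs through their sub-patterns concrete, the sketch below counts pairs of directed paths, one from each graph, that carry the same sequence of node labels. This is only an illustrative stand-in, not the kernel proposed in the paper: the choice of paths as sub-patterns, the adjacency-dictionary representation, the function name `dag_path_kernel`, and the toy labels are all assumptions made for the example. It does show the two properties stated above: acyclicity keeps the recursion finite, so memoization makes the count tractable, and because the value is an inner product of label-sequence counts it is positive semi-definite.

```python
from functools import lru_cache


def dag_path_kernel(succ_a, labels_a, succ_b, labels_b):
    """Count pairs of directed paths (one per DAG) with identical label sequences.

    succ_*   : dict mapping a node to a list of its successors (hypothetical format)
    labels_* : dict mapping every node to a hashable label
    """

    @lru_cache(maxsize=None)
    def matched_paths(u, v):
        # Number of label-matching path pairs that start at u (graph A) and v (graph B).
        if labels_a[u] != labels_b[v]:
            return 0
        total = 1  # the single-node paths (u) and (v) match
        for u_next in succ_a.get(u, ()):
            for v_next in succ_b.get(v, ()):
                total += matched_paths(u_next, v_next)
        return total

    # The recursion terminates because neither graph contains a cycle.
    return sum(matched_paths(u, v) for u in labels_a for v in labels_b)


# Toy graphs with hypothetical part labels.
succ_a = {0: [1, 2], 1: [2]}
labels_a = {0: "arm", 1: "hand", 2: "ball"}
succ_b = {0: [1]}
labels_b = {0: "arm", 1: "hand"}

k_ab = dag_path_kernel(succ_a, labels_a, succ_b, labels_b)  # 3
k_aa = dag_path_kernel(succ_a, labels_a, succ_a, labels_a)  # 7
k_bb = dag_path_kernel(succ_b, labels_b, succ_b, labels_b)  # 3
```

Evaluating such a function on every pair of training graphs yields a Gram matrix, which could then be passed to a support vector machine that accepts precomputed kernels.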
