Multi-task Clustering of Human Actions by Sharing Information

Sharing information between multiple tasks can enhance the accuracy of human action recognition systems. However, using shared information to improve multi-task human action clustering has not been considered before and cannot be achieved with existing clustering methods. In this work, we present a novel and effective Multi-Task Information Bottleneck (MTIB) clustering method, which explores the information shared between multiple action clustering tasks to improve the performance of each individual task. Our motivation is that different action collections often share many similar action patterns, and exploiting this shared information can lead to improved performance. Specifically, MTIB formulates the problem as the minimization of an information loss function, in which the shared information is quantified by the distributional correlation of clusters across tasks, measured over a high-level common vocabulary constructed through a novel agglomerative information-maximization method. Extensive experiments on two kinds of challenging data sets, realistic action data sets (HMDB & UCF50, Olympic & YouTube) and cross-view data sets (IXMAS, WVU), show that the proposed approach compares favorably with state-of-the-art methods.
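
To make the idea of building a compact common vocabulary by information-preserving merging concrete, the sketch below illustrates the classical agglomerative Information Bottleneck criterion on which such constructions are typically based: codewords are greedily merged in the order that loses the least information about their co-occurring features. This is a minimal illustration under assumed inputs (a codeword prior `p_c` and conditional distributions `p_y_given_c`), not the authors' MTIB implementation, and all function and variable names are hypothetical.

```python
# Minimal sketch of agglomerative Information Bottleneck merging
# (classical criterion; not the paper's exact MTIB procedure).
import numpy as np

def js_divergence(p, q, w1, w2, eps=1e-12):
    """Weighted Jensen-Shannon divergence between distributions p and q."""
    m = w1 * p + w2 * q
    kl = lambda a, b: np.sum(a * np.log((a + eps) / (b + eps)))
    return w1 * kl(p, m) + w2 * kl(q, m)

def merge_cost(p_c, p_y, i, j):
    """Information loss incurred by merging codewords i and j."""
    w = p_c[i] + p_c[j]
    return w * js_divergence(p_y[i], p_y[j], p_c[i] / w, p_c[j] / w)

def agglomerative_ib(p_c, p_y_given_c, n_clusters):
    """Greedily merge codewords until only n_clusters remain."""
    p_c = list(p_c)
    p_y = [np.asarray(row, dtype=float) for row in p_y_given_c]
    while len(p_c) > n_clusters:
        # Find the pair whose merge loses the least information.
        best, best_pair = np.inf, None
        for i in range(len(p_c)):
            for j in range(i + 1, len(p_c)):
                cost = merge_cost(p_c, p_y, i, j)
                if cost < best:
                    best, best_pair = cost, (i, j)
        i, j = best_pair
        # Replace codeword i with the merged codeword and drop j.
        w = p_c[i] + p_c[j]
        p_y[i] = (p_c[i] * p_y[i] + p_c[j] * p_y[j]) / w
        p_c[i] = w
        del p_c[j], p_y[j]
    return p_c, p_y
```

Once such a shared vocabulary is obtained, clusters from different tasks can be represented as distributions over it, which is what allows their distributional correlation to be measured in a common space.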
