Alignment Distances on Systems of Bags

Recent research in image and video recognition indicates that many visual processes can be thought of as being generated by a time-varying generative model. A natural descriptive model for visual processes is thus a statistical distribution that varies over time. Specifically, modeling visual processes as streams of histograms generated by a kernelized linear dynamic system turns out to be efficient. We refer to such a model as a system of bags. In this paper, we investigate systems of bags with special emphasis on dynamic scenes and dynamic textures. The parameters of a linear dynamic system are ambiguous, since different parameter sets can describe the same system. To cope with these ambiguities in the kernelized setting, we develop a kernelized version of the alignment distance. For its computation, we use a Jacobi-type method and prove its convergence to a set of critical points. We employ the resulting distance as a dissimilarity measure on systems of bags. As such, it outperforms other known dissimilarity measures for kernelized linear dynamic systems, in particular the Martin distance and the maximum singular value distance, in every tested classification setting. A considerable margin can be observed in settings where classification is performed with respect to an abstract mean of video sets. For this scenario, the presented approach can outperform state-of-the-art techniques such as dynamic fractal spectrum or orthogonal tensor dictionary learning.
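To illustrate the kind of parameter ambiguity an alignment distance is meant to remove, the following minimal sketch considers an ordinary (non-kernelized) linear dynamic system with a two-dimensional state. It assumes a common formulation in which a state transition matrix A and an observation matrix C describe the same system as (QᵀAQ, CQ) for any orthogonal Q, and in which the squared alignment distance is the minimum over orthogonal Q of the summed squared Frobenius differences. This is an assumption made for illustration, not the kernelized distance developed in the paper, and the brute-force angle search stands in for the Jacobi-type method. All function names are hypothetical.

```python
import math

def matmul(X, Y):
    """Plain nested-list matrix product."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(row) for row in zip(*X)]

def sq_frobenius_diff(X, Y):
    """Squared Frobenius norm of X - Y."""
    return sum((x - y) ** 2 for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

def orthogonal_2x2(theta, reflect):
    """Rotation by theta, or a reflection; together these cover O(2)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [s, -c]] if reflect else [[c, -s], [s, c]]

def change_basis(A, C, Q):
    """Assumed equivalence: (A, C) -> (Q^T A Q, C Q)."""
    return matmul(matmul(transpose(Q), A), Q), matmul(C, Q)

def alignment_distance_sq(A1, C1, A2, C2, steps=3600):
    """Brute-force minimum over a grid of orthogonal 2x2 matrices."""
    best = math.inf
    for reflect in (False, True):
        for k in range(steps):
            Q = orthogonal_2x2(2 * math.pi * k / steps, reflect)
            A2q, C2q = change_basis(A2, C2, Q)
            cost = sq_frobenius_diff(A1, A2q) + sq_frobenius_diff(C1, C2q)
            best = min(best, cost)
    return best

# Illustrative parameters (made up for this sketch).
A1 = [[0.9, 0.2], [0.0, 0.5]]
C1 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# The same system written in a different state-space basis.
Q0 = orthogonal_2x2(2 * math.pi * 500 / 3600, reflect=True)
A2, C2 = change_basis(A1, C1, Q0)

naive = sq_frobenius_diff(A1, A2) + sq_frobenius_diff(C1, C2)
aligned = alignment_distance_sq(A1, C1, A2, C2)
print(naive, aligned)
```

Comparing the raw parameters reports a clearly nonzero difference even though both parameter sets describe the same system, whereas minimizing over the basis change brings the difference down to numerical zero. An exhaustive search of this kind is only feasible for very small state dimensions.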
