Exploiting high level feature for dynamic textures recognition

In this paper, we propose a novel framework for dynamic texture (DT) recognition that learns a high-level feature using a deep neural network (DNN). The insight behind the method is that a DT appearing in different videos should share similar features, which can be learned to improve recognition performance. Unlike many prior works, which focus only on low-level or middle-level features, we learn a high-level feature with a DNN. Our goal is to construct a compact and discriminative semantic feature. The conventional bag-of-features approach using k-means is not semantically meaningful, since its clustering criterion is based on appearance similarity. The proposed framework overcomes this problem by using the DNN to capture the semantic relations among middle-level features. Extensive experiments with qualitative and quantitative results demonstrate the efficacy of our approach.
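For readers unfamiliar with the baseline mentioned above, the following is a minimal sketch of a conventional bag-of-features pipeline with k-means, written in plain Python. It is not the authors' implementation: the function names (`kmeans`, `bag_of_features`), the toy 2-D descriptors, and the parameter values are all assumptions chosen for illustration. It shows that codewords are assigned purely by distance in descriptor space, that is, by appearance similarity, which is the limitation the abstract points to.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to `point` (ties go to the lowest index)."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's k-means; returns k centroids (the visual codebook)."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct training descriptors.
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = [sum(c) / len(members) for c in zip(*members)]
    return centroids


def bag_of_features(descriptors, centroids):
    """Normalised histogram of codeword assignments for one video."""
    hist = [0.0] * len(centroids)
    for d in descriptors:
        hist[nearest(d, centroids)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


# Toy 2-D "descriptors" (hypothetical data): two well-separated appearance groups.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids = kmeans(points, k=2)
near = bag_of_features(points[:3], centroids)
far = bag_of_features(points[3:], centroids)
```

In this sketch, two descriptors fall into the same codeword only when they are close in descriptor space. Descriptors of the same DT that look different across videos can therefore land in different bins, which is the gap that a learned high-level feature is meant to close.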
