Dynamic Texture Comparison Using Derivative Sparse Representation: Application to Video-Based Face Recognition

Video-based face, expression, and scene recognition are fundamental problems in human–machine interaction, especially when only short videos are available. In this paper, we present a new derivative sparse representation approach for face and texture recognition from short videos. First, the method builds local linear subspaces of dynamic texture segments by computing spatiotemporal directional derivatives in a cylinder neighborhood within each dynamic texture. Unlike traditional methods, it uses a nonbinary texture coding technique that extracts high-order derivatives over continuous circular and cylinder regions to avoid aliasing effects. These local linear subspaces of texture segments are then mapped onto a Grassmann manifold via sparse representation. Finally, a new joint sparse representation algorithm establishes correspondences between subspace points on the manifold, which are used to measure the similarity between two dynamic textures. Extensive experiments on the Honda/UCSD, CMU Motion of Body (MoBo), YouTube, and DynTex datasets show that the proposed method consistently outperforms state-of-the-art methods in dynamic texture recognition and achieves the highest accuracy reported to date on the challenging YouTube face dataset. These results demonstrate the effectiveness of the proposed method for video-based face recognition in human–machine system applications.
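
To make the idea of comparing local linear subspaces on a Grassmann manifold more concrete, the following is a minimal sketch, not the method proposed in the paper. It assumes each texture segment has already been turned into a matrix of feature vectors, and it substitutes the standard principal-angle geodesic distance for the joint sparse representation step. The function names `subspace_basis`, `principal_angles`, and `grassmann_distance` are hypothetical and chosen only for illustration.

```python
import numpy as np


def subspace_basis(features, dim):
    """Return an orthonormal basis (d x dim) for a local linear subspace.

    `features` is a d x n matrix whose columns are feature vectors of one
    texture segment (assumed input; the paper derives them from
    spatiotemporal directional derivatives). Requires dim <= min(d, n).
    """
    centered = features - features.mean(axis=1, keepdims=True)
    u, _, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :dim]


def principal_angles(basis_a, basis_b):
    """Principal angles (radians) between two subspaces given orthonormal bases."""
    cosines = np.linalg.svd(basis_a.T @ basis_b, compute_uv=False)
    # Clip to guard against round-off pushing cosines slightly outside [-1, 1].
    return np.arccos(np.clip(cosines, -1.0, 1.0))


def grassmann_distance(basis_a, basis_b):
    """Geodesic distance on the Grassmann manifold: the 2-norm of the principal angles.

    Stand-in similarity measure, not the paper's joint sparse representation.
    """
    return float(np.linalg.norm(principal_angles(basis_a, basis_b)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    segment_a = rng.standard_normal((6, 10))  # synthetic features, illustration only
    segment_b = rng.standard_normal((6, 10))
    basis_a = subspace_basis(segment_a, dim=3)
    basis_b = subspace_basis(segment_b, dim=3)
    print("distance(a, a):", grassmann_distance(basis_a, basis_a))
    print("distance(a, b):", grassmann_distance(basis_a, basis_b))
```

A smaller distance indicates more similar subspaces. A video-level comparison would still need a rule for matching the many segment subspaces of one video to those of another, which is the role the abstract assigns to the joint sparse representation algorithm.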
