A review of image set classification

Abstract In computer vision, a classification problem is usually solved using a single image. With video cameras now widely used in everyday life, it is natural to solve classification problems using image sets instead. Compared with single-image methods, image set classification deals with severe changes in appearance and makes decisions by comparing the query set with gallery sets. Image set classification therefore offers more promise and has attracted significant research attention in recent years. In this paper, we review image set classification. The review begins with an overview of the field and then details some classic algorithms. Experimental analyses are provided in the corresponding subsections to compare the classification performance of the various methods and to draw conclusions. Finally, several promising directions and tasks are offered as guidelines for future work.
