Mutual-Information-Graph Regularized Sparse Transform for Unsupervised Feature Learning

Unsupervised feature learning is attracting growing attention in machine learning and computer vision because real-world applications increasingly demand effective representations of large-scale unlabeled data. This paper proposes a mutual-information-graph regularized sparse transform (MIST) algorithm that takes into account both feature sparsity and the underlying manifold structure of the observed data. The feature transform is formulated with a transform kernel and a bias matrix. To obtain sparse features, sparse filtering is used as the nonlinear activation function. A mutual information graph is proposed to describe the underlying manifold structure of the observed data. The transform kernel and the bias matrix are then learned under the regularization of the mutual information graph. The proposed approach is therefore both sparse and local-structure-preserving, and these two properties guarantee its discriminative power and robustness in practical applications. Experimental results on handwritten digit recognition show that the proposed approach achieves high performance compared with existing unsupervised feature learning models.
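The abstract gives no formulas, so the following is only a minimal sketch of how the described pieces might fit together, not the authors' implementation. It assumes (hypothetically) that the bias is one value per feature broadcast over all samples, that graph weights are histogram estimates of the mutual information between pairs of samples with values in [0, 1], and that the regularizer is the usual weighted sum of squared feature distances used in graph-regularized methods. The sparse filtering step follows the standard recipe: soft-absolute activation, L2 normalization of each feature across samples, then L2 normalization of each sample. All function names and the weight `lam` are illustrative.

```python
import math

def sparse_filtering_features(W, b, X, eps=1e-8):
    """W: k x d kernel, b: k biases (assumed per-feature), X: n samples of length d."""
    # Linear transform plus bias, followed by the soft-absolute activation.
    F = [[math.sqrt((sum(w * x for w, x in zip(row, s)) + bias) ** 2 + eps)
          for row, bias in zip(W, b)] for s in X]
    k = len(W)
    # L2-normalize each feature across all samples.
    col = [math.sqrt(sum(f[j] ** 2 for f in F)) for j in range(k)]
    F = [[f[j] / col[j] for j in range(k)] for f in F]
    # L2-normalize each sample's feature vector.
    out = []
    for f in F:
        norm = math.sqrt(sum(v * v for v in f))
        out.append([v / norm for v in f])
    return out

def mutual_information(x, y, bins=4):
    """Histogram MI estimate (nats) between paired values in [0, 1] (assumed estimator)."""
    n = len(x)
    def bin_of(v):
        return max(0, min(int(v * bins), bins - 1))
    joint, px, py = {}, {}, {}
    for a, c in zip(x, y):
        key = (bin_of(a), bin_of(c))
        joint[key] = joint.get(key, 0) + 1
    for (i, j), cnt in joint.items():
        px[i] = px.get(i, 0) + cnt
        py[j] = py.get(j, 0) + cnt
    return sum((c / n) * math.log(c * n / (px[i] * py[j]))
               for (i, j), c in joint.items())

def graph_penalty(F, weights):
    """Weighted sum of squared distances between feature vectors (assumed regularizer)."""
    total = 0.0
    for i in range(len(F)):
        for j in range(i + 1, len(F)):
            d2 = sum((a - c) ** 2 for a, c in zip(F[i], F[j]))
            total += weights[i][j] * d2
    return total

def mist_style_objective(W, b, X, lam=0.1):
    """Sparsity term (L1 of the normalized features) plus graph regularization."""
    F = sparse_filtering_features(W, b, X)
    n = len(X)
    weights = [[mutual_information(X[i], X[j]) if i != j else 0.0
                for j in range(n)] for i in range(n)]
    sparsity = sum(sum(f) for f in F)  # features are positive, so this is the L1 norm
    return sparsity + lam * graph_penalty(F, weights), F

# Toy data: three 4-dimensional samples and a 2 x 4 kernel (made-up values).
X = [[0.0, 0.1, 0.9, 1.0], [0.1, 0.0, 1.0, 0.9], [0.9, 1.0, 0.1, 0.0]]
W = [[1.0, -1.0, 0.5, 0.0], [0.0, 0.5, -1.0, 1.0]]
b = [0.1, -0.1]
value, F = mist_style_objective(W, b, X)
print(value)
```

In a full learner, `W` and `b` would be updated by a gradient-based optimizer to minimize this objective; the sketch only evaluates it once to show how the sparsity term and the graph term combine.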
