A Transfer Learning based Feature-Weak-Relevant Method for Image Clustering

Image clustering groups a set of images into disjoint clusters such that images in the same cluster are more similar to each other than to those in other clusters; it is typically an unsupervised or semi-supervised learning process and a crucial, challenging task in machine learning and computer vision. The performance of existing image clustering methods depends heavily on the features used for clustering, even though unsupervised coding based methods have improved performance considerably. To reduce this dependence on the clustering features, we propose a feature-weak-relevant method for image clustering. The proposed method converts the unsupervised clustering process into an alternating iteration of unsupervised learning and transfer learning. The clustering process first starts from clustering on handcrafted features to estimate an initial label for every image, and then uses a proposed sampling strategy to select images with reliable labels, which are fed to a transfer-learning model that learns representative features for the next round of unsupervised learning. In this manner, image clustering is iteratively optimized. Moreover, the handcrafted features only bootstrap the clustering process and have little effect on the final performance; the proposed method is therefore feature-weak-relevant. Experimental results on six publicly available datasets show that the proposed method outperforms state-of-the-art methods while depending less on the employed features.
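To make the alternating procedure concrete, the sketch below outlines one possible implementation of the loop. It is a minimal illustration, not the authors' code: it assumes scikit-learn's KMeans for the unsupervised step, substitutes a small MLP trained on the reliable pseudo-labels for the pretrained CNN that the paper fine-tunes, and uses distance to the assigned cluster centre as a stand-in reliability criterion; all function names and parameters are illustrative assumptions.

import numpy as np
from sklearn.cluster import KMeans
from sklearn.neural_network import MLPClassifier


def select_reliable(features, labels, centers, keep_ratio=0.5):
    """Keep the samples closest to their assigned cluster centre
    (an assumed stand-in for the paper's sampling strategy)."""
    dists = np.linalg.norm(features - centers[labels], axis=1)
    order = np.argsort(dists)
    return order[: int(len(order) * keep_ratio)]


def iterative_clustering(handcrafted, n_clusters=10, n_rounds=5, seed=0):
    """Alternate k-means clustering with a feature learner trained only on
    the reliably labelled subset, then re-encode all images for the next round."""
    features = handcrafted  # round 0: boot up from handcrafted features
    labels = None
    for _ in range(n_rounds):
        km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(features)
        labels = km.labels_
        keep = select_reliable(features, labels, km.cluster_centers_)
        # Stand-in for fine-tuning a pretrained CNN on the reliable pseudo-labels.
        learner = MLPClassifier(hidden_layer_sizes=(64,), max_iter=300, random_state=seed)
        learner.fit(handcrafted[keep], labels[keep])
        # Use the learner's hidden-layer activations as the new representation.
        features = np.maximum(handcrafted @ learner.coefs_[0] + learner.intercepts_[0], 0)
    return labels


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(300, 128))  # placeholder handcrafted descriptors
    print(iterative_clustering(X, n_clusters=3)[:20])

In the actual method the feature learner would be a deep network initialized from a pretrained model (transfer learning), so the representation improves over rounds while the handcrafted features only serve to start the loop.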
