Robust unsupervised feature selection via dual self-representation and manifold regularization

Abstract Unsupervised feature selection has become an important and challenging pre-processing step in machine learning and data mining, since large amounts of unlabelled high-dimensional data often need to be processed. In this paper, we propose an efficient method for robust unsupervised feature selection via dual self-representation and manifold regularization, referred to as DSRMR. On the one hand, a feature self-representation term is used to learn the feature representation coefficient matrix, which measures the importance of the different feature dimensions. On the other hand, a sample self-representation term is used to automatically learn the sample similarity graph, which preserves the local geometrical structure of the data; this structure has been shown to be critical in unsupervised feature selection. Because the ℓ2,1-norm is used to regularize both the feature representation residual matrix and the representation coefficient matrix, our method is robust to outliers, and the row sparsity that the ℓ2,1-norm induces in the feature coefficient matrix allows representative features to be selected effectively. During optimization, the feature coefficient matrix and the sample similarity graph constrain each other to obtain an optimal solution. Experimental results on ten real-world data sets demonstrate that the proposed method can effectively identify important features, outperforming many state-of-the-art unsupervised feature selection methods in terms of clustering accuracy (ACC) and normalized mutual information (NMI).
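
The feature self-representation idea described in the abstract can be illustrated with a minimal sketch. This is not the DSRMR algorithm: the abstract does not state the full objective, so the code below assumes a simplified single-term problem, min_W ||X - XW||_{2,1} + lam * ||W||_{2,1}, solved by a standard iteratively reweighted least-squares scheme, and it omits the sample self-representation graph and the manifold regularizer entirely. The function names, the parameter `lam`, and the iteration count are hypothetical choices made for illustration only.

```python
import numpy as np


def l21_norm(M):
    """l2,1-norm: the sum of the Euclidean norms of the rows of M."""
    return float(np.sum(np.linalg.norm(M, axis=1)))


def self_representation_scores(X, lam=1.0, n_iter=30, eps=1e-8):
    """Score features by row norms of a self-representation matrix W.

    Assumed (simplified) objective, NOT the full DSRMR model:
        min_W ||X - X W||_{2,1} + lam * ||W||_{2,1}
    X has shape (n_samples, n_features); each feature is reconstructed
    from all features, and row j of W says how much feature j contributes.
    """
    n, d = X.shape
    g_r = np.ones(n)  # reweighting for residual rows (one per sample)
    g_w = np.ones(d)  # reweighting for coefficient rows (one per feature)
    W = np.zeros((d, d))
    for _ in range(n_iter):
        XtG = X.T * g_r  # X^T diag(g_r), shape (d, n)
        A = XtG @ X + lam * np.diag(g_w)
        W = np.linalg.solve(A, XtG @ X)
        R = X - X @ W
        # Samples with large residuals (possible outliers) get small weights.
        g_r = 1.0 / (2.0 * np.maximum(np.linalg.norm(R, axis=1), eps))
        # Rows of W with small norm are pushed further towards zero.
        g_w = 1.0 / (2.0 * np.maximum(np.linalg.norm(W, axis=1), eps))
    return np.linalg.norm(W, axis=1)


def select_features(X, k, **kwargs):
    """Return the indices of the k highest-scoring features."""
    scores = self_representation_scores(X, **kwargs)
    return np.argsort(scores)[::-1][:k]
```

Calling `select_features(X, k)` on a samples-by-features array returns the `k` column indices whose rows in W have the largest norms. Measuring the residual row by row with the ℓ2,1-norm means each sample's error enters unsquared, which is what limits the influence of outlying samples, while the same norm on W drives whole rows to zero so that unhelpful features drop out together.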
