Scalable and Flexible Unsupervised Feature Selection

Graph-based unsupervised feature selection (GUFS) algorithms have recently been shown to handle high-dimensional unlabeled data efficiently. A common drawback of existing graph-based approaches is that they are time-consuming and require large amounts of storage, especially as data sets grow. Recent work uses anchors to accelerate graph-based learning models for feature selection, but the hard linear constraint it imposes between the data matrix and the lower-dimensional representation is too strict for many applications. In this letter, we propose a flexible linearization model with an anchor graph and ℓ2,1-norm regularization, which can handle large-scale data sets and improves on the performance of the existing anchor-based method. In addition, the anchor-based graph Laplacian is constructed with a parameter-free adaptive neighbor assignment strategy to characterize the manifold embedding structure. We develop an efficient iterative algorithm to solve the optimization problem and prove its convergence. Experiments on several public data sets demonstrate the effectiveness and efficiency of the proposed method.
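
The abstract does not give the objective function or the update rules, so the following is only a minimal sketch of two standard building blocks it names, not the proposed algorithm itself. It assumes that the ℓ2,1-norm of a projection matrix W is the sum of the Euclidean norms of its rows, that features are ranked by those row norms, and that the parameter-free anchor weights follow the closed form commonly used in adaptive-neighbor graph construction (each point connects to its k nearest anchors, with weights that sum to one). All function names and parameters here are hypothetical.

```python
import numpy as np


def l21_norm(W):
    """Sum of the Euclidean norms of the rows of W (the l2,1-norm)."""
    return float(np.sum(np.linalg.norm(W, axis=1)))


def rank_features(W, num_selected):
    """Indices of the num_selected rows of W with the largest l2 norm.

    Row i of W is assumed to hold the projection weights of feature i.
    """
    row_norms = np.linalg.norm(W, axis=1)
    return np.argsort(-row_norms, kind="stable")[:num_selected]


def adaptive_anchor_weights(X, anchors, k, eps=1e-12):
    """Weights between n points and m anchors, with k nonzeros per row.

    Assumed closed form (not taken from the abstract):
        z_ij = (d_{i,k+1} - d_ij) / (k * d_{i,k+1} - sum_{h<=k} d_ih)
    where d_i1 <= d_i2 <= ... are the sorted squared distances from
    point i to the anchors. Requires k < m.
    """
    if not 0 < k < anchors.shape[0]:
        raise ValueError("k must satisfy 0 < k < number of anchors")

    # Squared Euclidean distance from every point to every anchor.
    d = ((X[:, None, :] - anchors[None, :, :]) ** 2).sum(axis=2)

    order = np.argsort(d, axis=1)
    sorted_d = np.take_along_axis(d, order, axis=1)
    d_k1 = sorted_d[:, k]          # (k+1)-th smallest distance
    top = sorted_d[:, :k]          # k smallest distances
    denom = k * d_k1 - top.sum(axis=1)

    # If the k+1 nearest anchors are equidistant, fall back to uniform weights.
    degenerate = denom <= eps
    safe_denom = np.where(degenerate, 1.0, denom)
    weights = (d_k1[:, None] - top) / safe_denom[:, None]
    weights[degenerate] = 1.0 / k

    Z = np.zeros_like(d)
    np.put_along_axis(Z, order[:, :k], weights, axis=1)
    return Z
```

Because each row of the weight matrix has at most k nonzero entries, storage grows with the number of points times k rather than with the square of the number of points, which is the usual motivation for anchor graphs on large data sets.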
