Graph Autoencoder-Based Unsupervised Feature Selection with Broad and Local Data Structure Preservation

Feature selection is a dimensionality reduction technique that selects a subset of representative features from high-dimensional data by eliminating irrelevant and redundant features. Recently, feature selection combined with sparse learning has attracted significant attention because it outperforms traditional feature selection methods that ignore correlation between features. These methods first map the data onto a low-dimensional subspace and then select features by imposing a sparsity constraint on the transformation matrix. However, they are restricted by design to linear data transformations, a potential drawback given that the underlying correlation structures of data are often non-linear. To obtain a more sophisticated embedding, we propose an autoencoder-based unsupervised feature selection approach that uses a single-layer autoencoder in a joint framework of feature selection and manifold learning. More specifically, we enforce column sparsity on the weight matrix connecting the input layer and the hidden layer, as in previous work. Additionally, we incorporate spectral graph analysis of the projected data into the learning process to preserve local data geometry from the original data space to the low-dimensional feature space. Extensive experiments are conducted on image, audio, text, and biological data. The promising experimental results validate the superiority of the proposed method.
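The three ingredients named in the abstract (a single-layer autoencoder, column sparsity on the input-to-hidden weight matrix, and a spectral graph term on the projected data) can be illustrated with a minimal NumPy sketch. Everything specific in it is an assumption made for illustration rather than a detail from the abstract: the sigmoid activation, the squared reconstruction error, the symmetrized binary k-nearest-neighbor graph, the unnormalized Laplacian, and the names `knn_graph_laplacian`, `objective`, `feature_scores`, `lam`, and `gamma`. Only the loss evaluation is shown, not the optimization.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def knn_graph_laplacian(X, k=5):
    """Unnormalized Laplacian L = D - A of a symmetrized binary kNN graph.

    Assumption: the graph construction here is one common choice, not
    necessarily the one used in the paper.
    """
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * X @ X.T  # squared distances
    np.fill_diagonal(dist, np.inf)                    # exclude self-neighbors
    idx = np.argsort(dist, axis=1)[:, :k]             # k nearest per sample
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    A = np.maximum(A, A.T)                            # symmetrize
    return np.diag(A.sum(axis=1)) - A


def objective(X, W1, b1, W2, b2, L, lam=0.1, gamma=0.1):
    """Reconstruction + column sparsity on W1 + graph smoothness of H.

    X: (n, d) samples in rows; W1: (m, d) input-to-hidden weights;
    W2: (d, m) hidden-to-output weights; L: (n, n) graph Laplacian.
    lam and gamma are hypothetical trade-off weights.
    """
    n = X.shape[0]
    H = sigmoid(X @ W1.T + b1)                # (n, m) projected data
    X_hat = H @ W2.T + b2                     # (n, d) reconstruction
    recon = 0.5 / n * np.sum((X - X_hat) ** 2)
    # Sum of column l2 norms: column j of W1 touches only input feature j,
    # so driving a column to zero discards that feature.
    col_sparsity = np.sum(np.linalg.norm(W1, axis=0))
    # tr(H^T L H) = 0.5 * sum_ij A_ij ||h_i - h_j||^2: neighbors in the
    # input space are encouraged to stay close in the hidden space.
    graph = np.trace(H.T @ L @ H)
    return recon + lam * col_sparsity + gamma * graph


def feature_scores(W1):
    """Score each input feature by the l2 norm of its column in W1."""
    return np.linalg.norm(W1, axis=0)
```

Under these assumptions, one would minimize `objective` over `W1`, `b1`, `W2`, and `b2` with a gradient-based optimizer and then keep the features with the largest values of `feature_scores(W1)`.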
