UDSFS: Unsupervised deep sparse feature selection

Abstract In this paper, we focus on unsupervised feature selection. Combining several feature units into a single feature vector is widely adopted for effective object representation, but the resulting vector may inevitably include irrelevant or redundant feature units or feature dimensions. Most traditional feature selection models select only individual feature dimensions and ignore the intrinsic relationships among different feature units. By taking into account the group sparsity of both feature dimensions and feature units through an ℓ2,1-norm minimization, we propose a new unsupervised feature selection model, unsupervised deep sparse feature selection (UDSFS). In comparison with state-of-the-art methods, UDSFS not only selects the most discriminative feature units but also concurrently assigns proper weights to the useful feature dimensions; moreover, efficiency and robustness improve because the discarded irrelevant feature units no longer need to be extracted. For model optimization, we introduce an efficient iterative algorithm that solves the non-smooth, convex model and reaches a global optimum with a convergence rate of O(1/K^2), where K is the number of iterations. For the experiments, we build a new medical endoscopic image dataset, the Abnormal Endoscopic Image Detection dataset (AEID), and also test our model on two public UCI datasets. Experiments and comparisons with other state-of-the-art methods demonstrate the effectiveness and efficiency of the UDSFS model.
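As a rough illustration of why an ℓ2,1 penalty yields group sparsity, the sketch below computes the ℓ2,1 norm of a weight matrix (the sum of the Euclidean norms of its rows) and applies the standard row-wise soft-thresholding operator associated with that norm. This is not the UDSFS algorithm itself: the function names `l21_norm` and `prox_l21`, the example matrix `W`, and the threshold `tau` are assumptions made for illustration only, and the abstract does not specify how the authors' solver is implemented.

```python
import numpy as np

def l21_norm(W):
    """l2,1 norm: sum of the Euclidean norms of the rows of W."""
    return float(np.sum(np.linalg.norm(W, axis=1)))

def prox_l21(W, tau):
    """Row-wise soft-thresholding, the proximal operator of tau * ||W||_{2,1}.

    Rows whose norm is at most tau are set to zero as a whole; the other
    rows are shrunk toward zero. (Hypothetical helper, not from the paper.)
    """
    row_norms = np.linalg.norm(W, axis=1, keepdims=True)
    # Guard against division by zero for rows that are already all zero.
    scale = np.maximum(0.0, 1.0 - tau / np.maximum(row_norms, 1e-12))
    return scale * W

# Toy weight matrix: each row corresponds to one feature dimension.
W = np.array([[3.0, 4.0],
              [0.1, 0.0],
              [0.0, 2.0]])

penalty = l21_norm(W)          # sum of row norms
shrunk = prox_l21(W, tau=0.5)  # the weak second row is zeroed out entirely
selected = np.flatnonzero(np.linalg.norm(shrunk, axis=1) > 0)
```

Because the penalty acts on whole rows, a feature dimension is either kept (with a shrunken weight) or discarded entirely, which is the behavior that makes the ℓ2,1 norm useful for selection. An accelerated proximal gradient scheme built on such an operator is one common way to obtain an O(1/K^2) rate for non-smooth convex problems.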
