A discriminative multi-class feature selection method via weighted ℓ2,1-norm and Extended Elastic Net

Abstract Feature selection plays an important role in many pattern recognition and machine learning applications, where meaningful features must be extracted from high-dimensional raw data and noisy ones eliminated. Robust Feature Selection (RFS), which is based on ℓ2,1-norm regularization, has attracted a lot of attention due to its efficiency and the strong performance of its joint sparsity. In this paper, we propose a more general framework for robust and discriminative multi-class feature selection. Four types of weighting, each based on correlation information between features and labels, are adopted to strengthen the discriminative performance of ℓ2,1-norm joint sparsity. F-norm regularization, extended from the multi-class Elastic Net, is added to improve the stability of the method. An efficient algorithm and its convergence proof are provided. Experiments on several two-class and multi-class datasets verify the effectiveness of the proposed feature selection method.
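The abstract does not state the objective function, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes a squared-error loss with a row-weighted ℓ2,1 penalty and an F-norm penalty, min_W ||XW - Y||_F^2 + lam1 * sum_i d_i ||w_i||_2 + lam2 * ||W||_F^2, solved by a standard iteratively reweighted least-squares update. The function names, the parameters `lam1` and `lam2`, and the particular weighting d_i = 1 / (max absolute feature-label correlation) are all assumptions made for illustration; the paper's four weighting schemes are not specified here.

```python
import numpy as np

def correlation_weights(X, Y, eps=1e-8):
    """Hypothetical weighting: features that correlate strongly with any
    label column receive a smaller penalty weight."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    denom = np.linalg.norm(Xc, axis=0)[:, None] * np.linalg.norm(Yc, axis=0)[None, :]
    corr = np.abs(Xc.T @ Yc) / (denom + eps)   # shape (d, c)
    return 1.0 / (corr.max(axis=1) + eps)      # shape (d,)

def weighted_l21_scores(X, Y, weights, lam1=1.0, lam2=0.1, n_iter=50, eps=1e-8):
    """Assumed objective (not taken from the paper):
    ||XW - Y||_F^2 + lam1 * sum_i weights[i] * ||w_i||_2 + lam2 * ||W||_F^2.
    Returns one score per feature: the l2 norm of its row in W."""
    d = X.shape[1]
    G = X.T @ X
    B = X.T @ Y
    # Ridge-style initialisation.
    W = np.linalg.solve(G + (lam1 + lam2) * np.eye(d), B)
    for _ in range(n_iter):
        row_norms = np.sqrt(np.sum(W ** 2, axis=1) + eps)
        # Reweighting matrix from the stationarity condition of the l2,1 term.
        D = np.diag(weights / (2.0 * row_norms))
        W = np.linalg.solve(G + lam1 * D + lam2 * np.eye(d), B)
    return np.linalg.norm(W, axis=1)

# Synthetic three-class data: only features 0, 1 and 2 carry class information.
rng = np.random.default_rng(0)
n, d, c = 200, 20, 3
y = rng.integers(0, c, size=n)
X = rng.normal(size=(n, d))
for k in range(c):
    X[:, k] += 3.0 * (y == k)
Y = np.eye(c)[y]                  # one-hot label matrix

Xc = X - X.mean(axis=0)           # centre so no intercept term is needed
Yc = Y - Y.mean(axis=0)
scores = weighted_l21_scores(Xc, Yc, correlation_weights(X, Y))
top3 = sorted(np.argsort(scores)[-3:].tolist())
print(top3)
```

Features are ranked by the row norms of W, so rows driven toward zero by the joint-sparsity penalty correspond to discarded features. In this sketch the F-norm term keeps each linear solve well conditioned when features are correlated.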
