Online Group Feature Selection

Online feature selection with dynamic features has become an active research area in recent years. However, in some real-world applications, such as image analysis and email spam filtering, features may arrive in groups. Existing online feature selection methods evaluate features individually, while existing group feature selection methods cannot process features online. Motivated by this gap, we formulate the online group feature selection problem and propose a novel selection approach for it. Our approach consists of two stages: online intra-group selection and online inter-group selection. In the intra-group stage, we use spectral analysis to select discriminative features from each group as it arrives. In the inter-group stage, we use Lasso to select a globally optimal subset of features. This two-stage procedure continues until no more features arrive or a predefined stopping condition is met. Extensive experiments on benchmark and real-world data sets demonstrate that our approach outperforms other state-of-the-art online feature selection methods.
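The two-stage loop described above can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: a per-feature Fisher score stands in for the spectral analysis criterion, the Lasso is solved with a plain proximal-gradient (ISTA) routine, and the names `fisher_scores`, `lasso_ista`, `online_group_selection`, `top_k`, and `lam` are hypothetical, chosen only for illustration.

```python
import numpy as np

def fisher_scores(X, y):
    """Per-feature Fisher score (between-class over within-class variance).
    Assumption: used here as a simple stand-in for a spectral criterion."""
    mean_all = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for c in np.unique(y):
        Xc = X[y == c]
        between += len(Xc) * (Xc.mean(axis=0) - mean_all) ** 2
        within += len(Xc) * Xc.var(axis=0)
    return between / (within + 1e-12)

def lasso_ista(X, y, lam, n_iter=500):
    """Minimize (1/2n)||y - Xw||^2 + lam * ||w||_1 by proximal gradient."""
    n, d = X.shape
    w = np.zeros(d)
    # Step size 1/L, where L = ||X||_2^2 / n bounds the gradient's Lipschitz constant.
    step = n / (np.linalg.norm(X, 2) ** 2 + 1e-12)
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / n
        z = w - step * grad
        # Soft-thresholding yields exact zeros for weak features.
        w = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return w

def online_group_selection(groups, y, top_k=2, lam=0.05):
    """groups: iterable of (feature_indices, X_group), one group per arrival.
    top_k and lam are hypothetical tuning parameters."""
    selected_idx, selected_cols = [], []
    y_centered = y - y.mean()
    for idx, Xg in groups:
        # Stage 1 (intra-group): keep the top_k highest-scoring features of the new group.
        keep = np.argsort(fisher_scores(Xg, y))[::-1][:top_k]
        cand_idx = selected_idx + [idx[j] for j in keep]
        cand_cols = selected_cols + [Xg[:, j] for j in keep]
        # Stage 2 (inter-group): refit Lasso on all candidates, keep nonzero weights.
        C = np.column_stack(cand_cols)
        C = (C - C.mean(axis=0)) / (C.std(axis=0) + 1e-12)
        w = lasso_ista(C, y_centered, lam)
        nonzero = np.flatnonzero(np.abs(w) > 1e-8)
        selected_idx = [cand_idx[j] for j in nonzero]
        selected_cols = [cand_cols[j] for j in nonzero]
    return selected_idx
```

Calling `online_group_selection` with a stream of `(indices, matrix)` pairs returns the indices of the features that survive both stages after the last group. Features selected earlier are re-examined each time a new group arrives, which mirrors the global inter-group step; a stopping condition could be added by breaking out of the loop.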
