Online Feature Selection with Group Structure Analysis

Online selection of dynamically arriving features has attracted considerable interest in recent years. However, existing online feature selection methods evaluate features individually and ignore the underlying structure of the feature stream. In image analysis, for instance, features are generated in groups that represent color, texture, and other visual information, and simply ignoring this group structure during selection may degrade performance. Motivated by this observation, we formulate the problem as online group feature selection, in which features arrive individually but the feature stream carries a group structure. To the best of our knowledge, this is the first time the correlation among streaming features has been considered in online feature selection. To solve this problem, we develop a novel online group feature selection method named OGFS. Our approach consists of two stages: online intra-group selection and online inter-group selection. In the intra-group stage, we design a criterion based on spectral analysis to select discriminative features within each group. In the inter-group stage, we use a linear regression model to select an optimal subset across the groups seen so far. This two-stage procedure continues until no more features arrive or a predefined stopping condition is met. Finally, we apply our method to several tasks, including image classification and face verification. Extensive empirical studies on real-world and benchmark data sets demonstrate that our method outperforms state-of-the-art online feature selection methods.
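
To make the two-stage flow concrete, below is a minimal Python sketch of an OGFS-style loop. It is not the authors' implementation: `separability` is a hypothetical trace-ratio-style stand-in for the spectral intra-group criterion, scikit-learn's `Lasso` stands in for the inter-group linear-regression selection step, and `feature_groups`, `eps`, and `alpha` are illustrative names and parameters.

```python
import numpy as np
from sklearn.linear_model import Lasso


def separability(X, y):
    """Trace-ratio-style class separability of the selected columns:
    between-class vs. within-class scatter (hypothetical stand-in for
    the paper's spectral intra-group criterion)."""
    mu = X.mean(axis=0)
    between = within = 0.0
    for c in np.unique(y):
        Xc = X[y == c]
        between += len(Xc) * np.sum((Xc.mean(axis=0) - mu) ** 2)
        within += np.sum((Xc - Xc.mean(axis=0)) ** 2)
    return between / (within + 1e-12)


def ogfs_sketch(feature_groups, y, eps=1e-3, alpha=0.05):
    """Two-stage online group feature selection sketch.

    feature_groups: iterable of (n_samples, group_size) arrays arriving
                    over time; y: class labels of the n_samples instances.
    """
    y = np.asarray(y)
    selected = np.empty((len(y), 0))  # feature columns kept so far
    for G in feature_groups:
        # Stage 1: online intra-group selection -- keep a streaming feature
        # only if appending it improves class separability by at least eps.
        for j in range(G.shape[1]):
            candidate = np.column_stack([selected, G[:, j]])
            if selected.shape[1] == 0 or \
                    separability(candidate, y) > separability(selected, y) + eps:
                selected = candidate
        # Stage 2: online inter-group selection -- a sparse linear model
        # (Lasso regressed on the labels) prunes features that are redundant
        # across all groups seen so far.
        if selected.shape[1] > 1:
            coef = Lasso(alpha=alpha).fit(selected, y).coef_
            keep = np.abs(coef) > 1e-8
            if keep.any():
                selected = selected[:, keep]
    return selected
```

For example, `ogfs_sketch([G1, G2], y)` processes two feature groups in arrival order and returns the retained feature columns; in practice the intra-group criterion, the sparsity level, and the stopping condition would follow the paper's definitions rather than these placeholder choices.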
