Feature Selection Under a Complexity Constraint

Classification on mobile devices must often run continuously, which demands algorithms with low computational complexity. The performance of a classifier depends heavily on the set of features used as input variables. Existing feature selection strategies for classification aim at finding a "best" feature set that performs well in terms of classification accuracy, but they are not designed to handle constraints on computational complexity. We demonstrate that extending the performance measures used in state-of-the-art feature selection algorithms with a penalty on the feature extraction complexity leads to superior feature sets when the allowed computational complexity is limited. Our solution is independent of any particular classification algorithm.
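The idea of penalizing extraction complexity inside a selection criterion can be sketched as greedy forward selection with a penalized gain. This is only an illustrative sketch, not the paper's algorithm: the names `relevance` (a set-valued relevance score, e.g. mutual information with the class label), `cost` (per-feature extraction complexity), `budget`, and `penalty` are all hypothetical placeholders introduced here.

```python
# Hypothetical sketch: greedy forward feature selection where each
# candidate's relevance gain is penalized by its extraction complexity.
# relevance(S) scores a feature set; cost(f) is the extraction cost of
# feature f. Both are user-supplied callables, not from the paper.

def select_features(features, relevance, cost, budget, penalty):
    """Add, at each step, the feature maximizing the penalized gain
    relevance(S + [f]) - relevance(S) - penalty * cost(f),
    while keeping the total extraction cost within the budget."""
    selected = []
    total_cost = 0.0
    remaining = list(features)
    while remaining:
        base = relevance(selected)
        best, best_gain = None, 0.0
        for f in remaining:
            if total_cost + cost(f) > budget:
                continue  # feature would exceed the complexity budget
            gain = relevance(selected + [f]) - base - penalty * cost(f)
            if gain > best_gain:
                best, best_gain = f, gain
        if best is None:
            break  # no affordable feature improves the penalized score
        selected.append(best)
        total_cost += cost(best)
        remaining.remove(best)
    return selected
```

With `penalty = 0` this reduces to plain greedy forward selection under a hard cost cap; increasing `penalty` trades classification-relevant features for cheaper ones, matching the abstract's accuracy-versus-complexity trade-off.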
