Feature Selection Using Stochastic Search: An Application to System Identification

System identification using multiple-model strategies may involve thousands of candidate models, each with several parameters, yet only a few of these models are close to the correct one. A key task is therefore to find which parameters are important for explaining candidate models. This paper studies the application of feature selection to system identification and proposes a new feature selection algorithm. The algorithm follows the wrapper approach and combines two components: the search over feature subsets is performed using stochastic sampling, and candidate subsets are evaluated by classification with a support vector machine. On several benchmark data sets, this approach is found to perform better than genetic algorithm-based strategies for feature selection. Applied to system identification, the algorithm supports subsequent decision making.
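
The wrapper idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm. It assumes uniform random sampling of feature subsets in place of the paper's stochastic search, and a nearest-centroid classifier as a dependency-free stand-in for the support vector machine. The function names, the synthetic data, and every parameter value (`n_samples`, `seed`, the noise levels) are hypothetical choices made for illustration.

```python
import random


def nearest_centroid_accuracy(train, test, features):
    """Accuracy of a nearest-centroid classifier restricted to `features`.

    Assumption: this simple classifier stands in for the SVM used in the paper.
    `train` and `test` are lists of (feature_vector, label) pairs.
    """
    sums, counts = {}, {}
    for x, y in train:
        acc = sums.setdefault(y, [0.0] * len(features))
        for j, f in enumerate(features):
            acc[j] += x[f]
        counts[y] = counts.get(y, 0) + 1
    centroids = {y: [s / counts[y] for s in sums[y]] for y in sums}

    def sq_dist(x, centroid):
        return sum((x[f] - centroid[j]) ** 2 for j, f in enumerate(features))

    correct = 0
    for x, y in test:
        predicted = min(centroids, key=lambda c: sq_dist(x, centroids[c]))
        correct += predicted == y
    return correct / len(test)


def stochastic_wrapper_select(train, valid, n_features, n_samples=200, seed=0):
    """Wrapper feature selection: sample subsets at random, score each one by
    validation accuracy, and keep the best (ties go to the smaller subset).

    Assumption: uniform sampling replaces the paper's stochastic search.
    """
    rng = random.Random(seed)
    best_subset, best_score = None, -1.0
    for _ in range(n_samples):
        subset = [f for f in range(n_features) if rng.random() < 0.5]
        if not subset:
            continue  # an empty subset cannot be evaluated
        score = nearest_centroid_accuracy(train, valid, subset)
        if score > best_score or (
            score == best_score and len(subset) < len(best_subset)
        ):
            best_subset, best_score = subset, score
    return best_subset, best_score


def make_data(n, rng):
    """Synthetic two-class data: feature 0 is informative, features 1-4 are noise."""
    data = []
    for _ in range(n):
        y = rng.randint(0, 1)
        x = [4.0 * y + rng.gauss(0, 0.5)] + [rng.gauss(0, 3.0) for _ in range(4)]
        data.append((x, y))
    return data


if __name__ == "__main__":
    data_rng = random.Random(42)
    train_set, valid_set = make_data(100, data_rng), make_data(100, data_rng)
    subset, score = stochastic_wrapper_select(train_set, valid_set, n_features=5)
    print("selected features:", subset, "validation accuracy:", score)
```

In this sketch the classifier is retrained for every sampled subset, which is what distinguishes a wrapper from a filter method. Selecting on validation accuracy alone can overfit the validation set, so a held-out test set or cross-validation would be a reasonable addition in practice.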
