Data-Dimensionality Reduction Using Information-Theoretic Stepwise Feature Selector

A novel information-theoretic stepwise feature selector (ITSFS) is designed to reduce the dimensionality of diesel engine data. These data consist of 43 sensor measurements acquired from diesel engines that are either in a healthy state or in one of seven fault states. Using ITSFS, the minimum number of sensors is selected from the pool of 43 so that the eight engine states can be classified with reasonable accuracy. Various classifiers are trained and tested for fault classification accuracy on the field data before and after dimension reduction by ITSFS. The dimension-reduction and classification process is then repeated with existing techniques such as simulated annealing and regression subset selection, and the resulting classification accuracies are compared with those obtained using the data reduced by the proposed feature selector.
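The abstract does not spell out the ITSFS criterion, but a common way to realize an information-theoretic stepwise selector is a greedy forward search that, at each step, adds the sensor with the highest mutual information with the class label, penalized by its redundancy with the sensors already chosen. The sketch below illustrates that idea only; the function name stepwise_mi_select, the sensor budget k, and the redundancy weight lam are illustrative assumptions, not the paper's actual ITSFS algorithm.

    import numpy as np
    from sklearn.feature_selection import mutual_info_classif, mutual_info_regression

    def stepwise_mi_select(X, y, k=10, lam=0.5):
        """Greedy forward selection: pick up to k columns of X that have high
        mutual information with the label y and low redundancy with the
        columns already picked (a relevance-minus-redundancy score)."""
        relevance = mutual_info_classif(X, y)  # I(x_i; y) for each sensor
        selected = []
        remaining = list(range(X.shape[1]))
        while remaining and len(selected) < k:
            def score(i):
                if not selected:
                    return relevance[i]
                # Redundancy: mean MI between candidate i and chosen sensors.
                redundancy = np.mean(
                    [mutual_info_regression(X[:, [i]], X[:, j])[0] for j in selected]
                )
                return relevance[i] - lam * redundancy
            best = max(remaining, key=score)
            selected.append(best)
            remaining.remove(best)
        return selected

For the engine data described above, X would be the n-by-43 matrix of sensor measurements and y the eight-state condition label; the returned indices identify the reduced sensor subset passed on to the classifiers.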