Semi-random subspace sampling for classification

In this paper, we introduce a novel semi-random subspace sampling method for classification (denoted FS_RS for short). In this method, a ranked feature list is first obtained by feature selection; the N0 most important features at the front of the ranked list are then chosen, and N1 further features are randomly selected from the remaining features in the list. With this sampling method, the resulting feature subsets contain not only the more important features but also weakly relevant or irrelevant ones, hence both the diversity and the accuracy of the corresponding base classifiers can be ensured. As a result, the performance of the integrated classifier can be improved. Experiments on 4 real-life datasets show the effectiveness of our method.
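The sampling step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `semi_random_subspace` and `sample_subsets`, the `ranking` argument (feature indices already sorted from most to least important), and the use of NumPy are all hypothetical choices. The abstract does not specify which feature selection method produces the ranking or which base classifier is trained on each subset, so both are left out.

```python
import numpy as np


def semi_random_subspace(ranking, n0, n1, rng):
    """Return one feature subset: the top n0 ranked features plus
    n1 features drawn at random (without replacement) from the rest.

    `ranking` is assumed to hold feature indices ordered from most to
    least important; how it is produced is outside this sketch.
    """
    ranking = np.asarray(ranking)
    if n0 < 0 or n1 < 0 or n0 + n1 > ranking.size:
        raise ValueError("need 0 <= n0, 0 <= n1 and n0 + n1 <= number of features")
    top = ranking[:n0]    # deterministic part: the N0 most important features
    rest = ranking[n0:]   # candidates for the random part
    sampled = rng.choice(rest, size=n1, replace=False)
    return np.concatenate([top, sampled])


def sample_subsets(ranking, n0, n1, n_subsets, seed=0):
    """Draw several subsets, one per base classifier of the ensemble."""
    rng = np.random.default_rng(seed)
    return [semi_random_subspace(ranking, n0, n1, rng) for _ in range(n_subsets)]


# Hypothetical ranking of six features, most important first.
ranking = [5, 2, 7, 0, 1, 3]
subsets = sample_subsets(ranking, n0=2, n1=2, n_subsets=3)
```

Every subset shares the same first N0 features, which is what keeps each base classifier reasonably accurate, while the N1 randomly drawn features differ between subsets and provide the diversity. Each subset would then be used to train one base classifier, for example on `X[:, subset]`, with the predictions combined by a scheme such as majority voting (the combination rule is also an assumption here).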
