Classifier Committee Based on Feature Selection Method for Obstructive Nephropathy Diagnosis

The article presents a multiple-classifier approach to the recognition of obstructive nephropathy, a disease posing a significant threat to newborns. The nature of the data reflects the well-known problem of high dimensionality with a small sample size. In the presented approach, the feature space is divided among a number of classifiers to balance the relation between the number of objects and the number of features. Feature selection methods are applied to split the feature space optimally for the classifier ensemble. The optimal subspace size and the choice of base classifier for the ensemble are thoroughly tested. Comprehensive performance tests are presented to identify the most efficient parameter tuning for the proposed approach, which is then compared with classical solutions in this field.
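As a rough illustration only, and not the authors' exact method, the Python sketch below shows one way such a subspace committee can be assembled: features are ranked with a filter-type selection score, dealt into disjoint subspaces, and each subspace trains one base classifier whose predictions are combined by majority vote. The classifier choice, parameter values, and the synthetic data set are assumptions made for this sketch.

# Hypothetical sketch of a feature-subspace classifier committee:
# features are ranked with a filter method (ANOVA F-score here),
# split into disjoint subspaces, and each subspace trains one SVM.
# Predictions are combined by majority vote. Names and parameter
# values are illustrative, not the paper's exact configuration.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import f_classif
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Small-sample, high-dimensional synthetic data (a stand-in for the
# obstructive-nephropathy set, which is not described in the abstract).
X, y = make_classification(n_samples=60, n_features=500,
                           n_informative=30, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                          random_state=0, stratify=y)

n_members = 5          # committee size (a tunable parameter)
subspace_size = 20     # features per base classifier (a tunable parameter)

# Rank features on the training data only, then deal the top-ranked
# features round-robin into disjoint subspaces so each committee
# member receives an approximately equally informative share.
scores, _ = f_classif(X_tr, y_tr)
ranked = np.argsort(scores)[::-1][:n_members * subspace_size]
subspaces = [ranked[i::n_members] for i in range(n_members)]

members = []
for idx in subspaces:
    clf = SVC(kernel="linear").fit(X_tr[:, idx], y_tr)
    members.append((clf, idx))

# Majority vote over the committee members (binary labels 0/1).
votes = np.array([clf.predict(X_te[:, idx]) for clf, idx in members])
y_pred = (votes.mean(axis=0) >= 0.5).astype(int)
print("committee accuracy:", (y_pred == y_te).mean())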
