A New Distance Metric Based on Class-Space Reduction

The ultimate goal of classification research is to improve accuracy. Classification accuracy depends heavily on the overlap among the classes of a dataset: in general, a wider overlap area yields lower classification accuracy. In this study, we propose a new distance metric based on class-space reduction to improve classification accuracy. The proposed distance metric has the same effect as rescaling the training/test data by moving each data point toward the center point of the class to which it belongs. In experiments on real datasets, we confirmed that, in many cases, the new dataset generated by class-space reduction improved classification accuracy for some classification algorithms.
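
The rescaling effect described above can be illustrated with a minimal sketch. It is not the paper's distance metric itself: it only shows labeled points being moved toward their class center. The function name `shrink_toward_class_centers`, the shrink factor `alpha`, and the use of the class mean as the "center point" are all assumptions made for illustration; the abstract does not specify them, nor how unlabeled test points are handled.

```python
import numpy as np

def shrink_toward_class_centers(X, y, alpha=0.5):
    """Move each labeled point a fraction `alpha` of the way to its class mean.

    Hypothetical helper: `alpha` and the choice of the mean as the class
    center are assumptions, not taken from the paper.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    X_new = X.copy()
    for label in np.unique(y):
        mask = y == label                  # rows belonging to this class
        center = X[mask].mean(axis=0)      # assumed center: per-feature mean
        # alpha = 0 leaves points unchanged; alpha = 1 collapses them onto the center
        X_new[mask] = X[mask] + alpha * (center - X[mask])
    return X_new

# Two small classes in 2-D; each class is contracted around its own center.
X = np.array([[0.0, 0.0], [2.0, 0.0], [10.0, 10.0], [12.0, 10.0]])
y = np.array([0, 0, 1, 1])
X_reduced = shrink_toward_class_centers(X, y, alpha=0.5)
```

Under these assumptions, contracting each class around its own center leaves the centers where they were while reducing the spread within each class, which is the sense in which the overlap between classes can narrow.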
