Hybrid $k$-Nearest Neighbor Classifier

Conventional $k$-nearest neighbor (KNN) classification approaches have several limitations when applied to datasets with particular characteristics, such as sparsity, class imbalance, and noise. In this paper, we first briefly survey recent progress in KNN classification approaches. We then design the hybrid KNN (HBKNN) classification approach, which takes into account both the local and the global information of the query sample, to address the problems raised by such datasets. Next, we propose a random subspace ensemble framework based on the HBKNN classifier (RS-HBKNN) to perform classification on datasets with noisy attributes in high-dimensional space. Finally, we adopt nonparametric tests to compare the proposed method with other classification approaches over multiple datasets. Experiments on real-world datasets from the Knowledge Extraction based on Evolutionary Learning (KEEL) dataset repository demonstrate that RS-HBKNN works well on real datasets and outperforms most of the state-of-the-art classification approaches.
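To make the random subspace idea concrete, the following is a minimal sketch, not the paper's implementation. The abstract does not specify how HBKNN combines local and global information, so the base classifier here is a plain majority-vote KNN swapped in for HBKNN. The function names `knn_predict` and `random_subspace_knn`, the parameter defaults, and the toy data are all assumptions made for illustration. Each ensemble member sees only a random subset of the attributes, and the members' predictions are combined by majority vote.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, query, k, features):
    """Plain majority-vote k-NN restricted to the given feature indices.

    Assumption: this stands in for HBKNN, whose details the abstract omits.
    """
    dists = []
    for x, label in zip(train_X, train_y):
        # Squared Euclidean distance computed only over the chosen subspace.
        d = sum((x[j] - query[j]) ** 2 for j in features)
        dists.append((d, label))
    dists.sort(key=lambda pair: pair[0])
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def random_subspace_knn(train_X, train_y, query, k=3, n_members=15,
                        subspace_size=2, seed=0):
    """Hypothetical random subspace ensemble: each member votes using a
    random subset of attributes; the final label is the majority vote."""
    rng = random.Random(seed)
    n_features = len(train_X[0])
    member_votes = Counter()
    for _ in range(n_members):
        features = rng.sample(range(n_features), subspace_size)
        member_votes[knn_predict(train_X, train_y, query, k, features)] += 1
    return member_votes.most_common(1)[0][0]


# Toy data (illustrative only): three informative attributes, one noisy one.
train_X = [
    [0.0, 0.1, 0.2, 0.9],
    [0.3, 0.0, 0.1, 0.1],
    [0.1, 0.4, 0.0, 0.5],
    [5.0, 5.1, 4.9, 0.8],
    [4.8, 5.0, 5.2, 0.2],
    [5.1, 4.7, 5.0, 0.6],
]
train_y = [0, 0, 0, 1, 1, 1]

pred_a = random_subspace_knn(train_X, train_y, [0.2, 0.1, 0.3, 0.7])
pred_b = random_subspace_knn(train_X, train_y, [4.9, 5.0, 5.1, 0.3])
```

Restricting each member to a small attribute subset limits how much any single noisy attribute can influence the overall vote, which is the motivation the abstract gives for the ensemble.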
