A Hybrid Classification Approach Based on Support Vector Machine and K-Nearest Neighbor for Remote Sensing Data

Analysis and classification of landscapes from remote sensing imagery is a popular research topic. In this paper, we propose a new remote sensing data classifier that incorporates the learning information of the support vector machine (SVM) into the K-nearest neighbor (KNN) classifier. The SVM is well known for its strong generalization capability even with limited learning samples, which makes it very useful for remote sensing applications, where data samples are usually limited. The KNN has been widely used in data classification due to its simplicity and effectiveness. However, the KNN is instance-based and needs to keep all the training samples for classification, which can cause not only high computational complexity but also overfitting problems. Moreover, the performance of the KNN classifier is sensitive to the neighborhood size K, and selecting the value of K relies heavily on practice and experience. Based on the observation that the SVM can help the KNN with both the reduction of the training sample size and the selection of the parameter K, we propose a support vector nearest neighbor (abbreviated as SV-NN) hybrid classification approach, which simplifies parameter selection while maintaining classification accuracy. The proposed approach consists of two stages. In the first stage, the SVM is trained on the training samples to obtain a reduced set of support vectors (SVs) for each sample category. In the second stage, a nearest neighbor classifier (NNC) is used to classify a testing sample: the average Euclidean distance from the testing data point to each category's set of SVs is calculated, and the NNC assigns the sample to the category with the minimum average distance.
To evaluate the effectiveness of the proposed approach, we first conduct classification experiments on samples from remote sensing data and then experiments on identifying different land cover regions in remote sensing images. Experimental results show that the SV-NN approach maintains good classification accuracy while reducing the number of training samples that must be retained, compared with the conventional SVM and KNN classification models.
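
The two stages described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes scikit-learn's `SVC` as the SVM, and the function names `fit_sv_sets` and `predict_sv_nn`, as well as any kernel or regularization settings passed through, are hypothetical choices made only for this example.

```python
import numpy as np
from sklearn.svm import SVC


def fit_sv_sets(X, y, **svc_params):
    """Stage 1: train an SVM and group its support vectors by category.

    svc_params are forwarded to sklearn.svm.SVC (an assumption of this
    sketch; the abstract does not specify kernel or parameters).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    clf = SVC(**svc_params).fit(X, y)
    sv_labels = y[clf.support_]  # category of each support vector
    return {c: clf.support_vectors_[sv_labels == c] for c in clf.classes_}


def predict_sv_nn(sv_sets, X_test):
    """Stage 2: pick the category whose SV set is closest on average."""
    X_test = np.asarray(X_test, dtype=float)
    classes = list(sv_sets)
    # mean_dist[i, j] = average Euclidean distance from test point i
    # to the support vectors of category classes[j]
    mean_dist = np.stack(
        [
            np.linalg.norm(
                X_test[:, None, :] - sv_sets[c][None, :, :], axis=2
            ).mean(axis=1)
            for c in classes
        ],
        axis=1,
    )
    return np.array(classes)[mean_dist.argmin(axis=1)]
```

Only the support vectors are kept after the first stage, so the second stage compares a testing sample against a reduced set rather than all training samples, and no neighborhood size K has to be chosen.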
