Design and implementation of a parallel geographically weighted k-nearest neighbor classifier

Abstract: High-performance classifiers are an important step toward timely classification of remote sensing imagery in the era of high spatial resolution. The geographically weighted k-nearest neighbors (gwk-NN) classifier, which incorporates spatial information into the traditional k-NN classifier, has been shown to reduce salt-and-pepper noise and misclassification. However, integrating spatial dependence with spectral information is computationally intensive. To improve the computing performance of the gwk-NN classifier, this study first considered two commonly used parallel strategies, data parallelism and task parallelism, in the model training and image classification stages. The corresponding parallel algorithms were then implemented in C++ using the Message Passing Interface (MPI) and the Geospatial Data Abstraction Library (GDAL) on a standalone eight-core computer. Based on the performance of these two strategies, the potential of dual parallelism (the simultaneous exploitation of data and task parallelism) in image classification was further investigated. The experimental results indicate that the parallel gwk-NN classifier can improve the classification efficiency of high-resolution remote sensing images with multiple land cover types. Specifically, data parallelism is more effective than task parallelism in both the model training and classification stages because parallel overhead has only a minor effect on the total execution time. In addition, dual parallelism can take advantage of both data and task parallel strategies, as evidenced by the two largest speedups being attained under dual parallelism I (5.28×), which is based on task parallelism, and dual parallelism II (5.73×), which gives priority to data decomposition. Of the two, dual parallelism II provides the best performance by overlapping computation and data transmission, which is compatible with the current trend toward multicore architectures.
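
To make the data-parallel idea and the reported speedups more concrete, the following is a minimal C++ sketch, not the authors' implementation. It assumes the image is decomposed into contiguous row blocks, one per worker process; the abstract does not state the actual decomposition scheme. The names `RowBlock`, `split_rows`, and `parallel_efficiency` are hypothetical and introduced only for illustration. Efficiency is taken in the usual sense of speedup divided by the number of cores.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Half-open range of image rows [begin, end) assigned to one worker.
// Hypothetical type; row-block decomposition is an assumption, not the paper's stated scheme.
struct RowBlock {
    std::size_t begin;
    std::size_t end;
};

// Split total_rows into num_workers contiguous blocks whose sizes
// differ by at most one row, so the per-worker load is balanced.
std::vector<RowBlock> split_rows(std::size_t total_rows, std::size_t num_workers) {
    assert(num_workers > 0);
    std::vector<RowBlock> blocks;
    blocks.reserve(num_workers);
    const std::size_t base = total_rows / num_workers;   // rows every worker gets
    const std::size_t extra = total_rows % num_workers;  // leftover rows, one each to the first workers
    std::size_t start = 0;
    for (std::size_t w = 0; w < num_workers; ++w) {
        const std::size_t len = base + (w < extra ? 1 : 0);
        blocks.push_back({start, start + len});
        start += len;
    }
    return blocks;
}

// Parallel efficiency: speedup divided by the number of cores used.
double parallel_efficiency(double speedup, std::size_t cores) {
    assert(cores > 0);
    return speedup / static_cast<double>(cores);
}
```

In an MPI program, each rank would typically pick its own block with something like `split_rows(rows, size)[rank]` and read only those rows. If the reported speedups were measured with all eight cores in use (an assumption; the abstract does not give the process count), `parallel_efficiency(5.28, 8)` and `parallel_efficiency(5.73, 8)` give roughly 0.66 and 0.72 for dual parallelism I and II respectively.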
