Unsupervised image classification over supercomputers Kraken, Keeneland and Beacon

The iterative self-organizing data analysis technique algorithm (ISODATA) was implemented on the supercomputers Kraken, Keeneland and Beacon to explore scalable, high-performance solutions for image processing and analytics on emerging advanced computer architectures. With 100 CPU, GPU or MIC processors, extracting 10 classes from one 18-GB image tile takes no more than 90 seconds, down from several hours. Scalability tests were further conducted on Kraken, using 10,800 processors to extract varying numbers of classes from 12 image tiles totalling 216 gigabytes. As the first geospatial computations on GPU clusters (Keeneland) and MIC clusters (Beacon), this work provides a solid foundation for exploring scalable, high-performance geospatial computation for next-generation cyber-enabled image analytics.
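To make the computational pattern concrete, the following is a minimal, single-process sketch of one way an ISODATA-style iteration might be decomposed across processors. It is not the implementation used on Kraken, Keeneland or Beacon. The function names (`partial_sums`, `update_centers`, `isodata_like`), the chunking scheme and the `min_size` parameter are assumptions made for illustration, and the split and merge steps of full ISODATA are omitted. Each chunk stands in for the pixels held by one processor; only the per-class sums and counts would need to be combined between processors.

```python
from typing import List, Sequence, Tuple

Pixel = Sequence[float]
Centers = List[List[float]]


def nearest(pixel: Pixel, centers: Centers) -> int:
    """Index of the class center closest to the pixel (squared Euclidean distance)."""
    best, best_d = 0, float("inf")
    for k, c in enumerate(centers):
        d = sum((p - q) ** 2 for p, q in zip(pixel, c))
        if d < best_d:
            best, best_d = k, d
    return best


def partial_sums(chunk: Sequence[Pixel], centers: Centers) -> Tuple[List[List[float]], List[int]]:
    """Work a single (hypothetical) processor would do on its share of the pixels."""
    bands = len(centers[0])
    sums = [[0.0] * bands for _ in centers]
    counts = [0] * len(centers)
    for px in chunk:
        k = nearest(px, centers)
        counts[k] += 1
        for b in range(bands):
            sums[k][b] += px[b]
    return sums, counts


def update_centers(chunks: Sequence[Sequence[Pixel]], centers: Centers, min_size: int = 1) -> Centers:
    """Combine per-chunk results (a reduction) and recompute the class means."""
    bands = len(centers[0])
    total = [[0.0] * bands for _ in centers]
    count = [0] * len(centers)
    for chunk in chunks:  # in a parallel run, each chunk would live on a different processor
        s, c = partial_sums(chunk, centers)
        for k in range(len(centers)):
            count[k] += c[k]
            for b in range(bands):
                total[k][b] += s[k][b]
    # Discard classes with too few members; splitting and merging are omitted here.
    keep = max(min_size, 1)
    return [[t / count[k] for t in total[k]]
            for k in range(len(centers)) if count[k] >= keep]


def isodata_like(chunks: Sequence[Sequence[Pixel]], centers: Centers,
                 iterations: int = 10, min_size: int = 1) -> Centers:
    """Iterate until the centers stop changing or the iteration limit is reached."""
    for _ in range(iterations):
        new = update_centers(chunks, centers, min_size)
        if not new or new == centers:
            break
        centers = new
    return centers


if __name__ == "__main__":
    # Two chunks of single-band pixels, standing in for two processors.
    chunks = [[(0.0,), (1.0,)], [(10.0,), (11.0,)]]
    print(isodata_like(chunks, [[0.0], [10.0]]))
```

In this decomposition the data exchanged per iteration scales with the number of classes and bands rather than with the number of pixels, which is one plausible reason this kind of algorithm can scale to many processors.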
