Developing a parallel computational implementation of AMOEBA

As geospatial researchers gain wider access to high-performance computing clusters and to high-resolution spatial data, techniques are needed that exploit these clusters' ability to process and analyze large amounts of information quickly. This research concentrates on the parallel computation of A Multidirectional Optimal Ecotope-Based Algorithm (AMOEBA). AMOEBA is used both to derive spatial weight matrices for spatial autoregressive models and to identify irregularly shaped spatial clusters. Although the ‘constructive’ algorithm improves on the original ‘exhaustive’ algorithm, it can still take a significant amount of time to complete on large datasets. This article outlines a parallel implementation of AMOEBA (P-AMOEBA) written in Java using the message passing library MPJ Express. To account for differing types of spatial grid data, two decomposition methods are developed and tested. The benefits of the new parallel algorithm are demonstrated on an example dataset. Results show that the choice of spatial data decomposition affects the computational load balance across processors and that the parallel version of AMOEBA achieves substantially faster runtimes than those reported in related publications.
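The abstract does not say which two decomposition methods are used, so the following is only a minimal sketch of the general idea that the way grid rows are assigned to processes changes the load balance. It assumes, purely for illustration, a contiguous block split and a round-robin (cyclic) split of rows. The class name `DecompositionSketch`, its methods, and the per-row costs are hypothetical, and the sketch uses plain Java rather than MPJ Express so that it runs without a cluster.

```java
import java.util.Arrays;

public class DecompositionSketch {

    // Block decomposition (assumed for illustration): each process
    // receives a contiguous band of grid rows.
    public static int[] blockOwner(int rows, int procs) {
        int[] owner = new int[rows];
        for (int r = 0; r < rows; r++) {
            owner[r] = (int) ((long) r * procs / rows);
        }
        return owner;
    }

    // Cyclic decomposition (assumed for illustration): rows are dealt
    // out to processes in round-robin order.
    public static int[] cyclicOwner(int rows, int procs) {
        int[] owner = new int[rows];
        for (int r = 0; r < rows; r++) {
            owner[r] = r % procs;
        }
        return owner;
    }

    // Sum the hypothetical per-row cost assigned to each process.
    public static long[] loadPerProcess(int[] owner, int[] rowCost, int procs) {
        long[] load = new long[procs];
        for (int r = 0; r < owner.length; r++) {
            load[owner[r]] += rowCost[r];
        }
        return load;
    }

    public static void main(String[] args) {
        int procs = 4;
        // Hypothetical costs: e.g. the number of valid data cells per row,
        // concentrated in the upper part of the grid.
        int[] rowCost = {9, 9, 8, 8, 1, 1, 0, 0};
        long[] block = loadPerProcess(blockOwner(rowCost.length, procs), rowCost, procs);
        long[] cyclic = loadPerProcess(cyclicOwner(rowCost.length, procs), rowCost, procs);
        System.out.println("block:  " + Arrays.toString(block));
        System.out.println("cyclic: " + Arrays.toString(cyclic));
    }
}
```

With these made-up costs, the block split places most of the work on the first two processes, while the cyclic split spreads it far more evenly. Which scheme balances better in practice depends on how the data are distributed across the grid.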
