A Parallel Formulation of the Spatial Auto-Regression Model for Mining Large GeoSpatial Datasets

The spatial auto-regression model (SAM) is a popular spatial data mining technique that has been used in many applications with geo-spatial datasets. However, serial procedures for estimating SAM parameters are computationally expensive because they must compute all the eigenvalues of a very large matrix. In this paper, we propose a parallel formulation of the SAM parameter estimation procedure using data parallelism and a hybrid programming technique. Experimental results on an IBM Regatta show that the proposed parallel formulation achieves a speedup of up to 7 on 8 processors. We are developing algebraic cost models to analyze the experimental results and further improve the speedups.
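
To illustrate why all the eigenvalues are needed, the sketch below assumes the usual maximum-likelihood treatment of spatial auto-regression, in which the log-likelihood contains the term ln|I - rho*W| and this term can be evaluated as the sum of ln(1 - rho*lambda_i) over the eigenvalues lambda_i of the neighborhood matrix W. The abstract does not spell out the estimator, so this framing, the toy ring-shaped W, and all function names are assumptions made for illustration only, not the procedure used in the paper.

```python
import numpy as np


def row_normalized_ring(n):
    """Hypothetical neighborhood matrix W: n sites on a ring, two neighbors each."""
    W = np.zeros((n, n))
    for i in range(n):
        W[i, (i - 1) % n] = 0.5  # left neighbor
        W[i, (i + 1) % n] = 0.5  # right neighbor
    return W


def logdet_from_eigenvalues(eigvals, rho):
    """Evaluate ln|I - rho*W| as sum_i ln(1 - rho*lambda_i)."""
    return float(np.sum(np.log(1.0 - rho * eigvals)))


n = 200
W = row_normalized_ring(n)

# The expensive step: all n eigenvalues of W, computed once.
# This toy W is symmetric, so the symmetric eigenvalue solver applies.
eigvals = np.linalg.eigvalsh(W)

# Afterwards, each candidate rho costs only O(n) work.
rho = 0.4
fast = logdet_from_eigenvalues(eigvals, rho)

# Cross-check against a direct dense log-determinant.
sign, direct = np.linalg.slogdet(np.eye(n) - rho * W)
```

Under these assumptions the eigenvalue computation is done once and reused for every candidate value of rho, which is why it dominates the serial cost and is the natural target for parallelization.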