What’s Spatial About Spatial Data Mining: Three Case Studies

Spatial data mining is the process of discovering interesting, previously unknown, and potentially useful patterns from large spatial datasets. Extracting such patterns from spatial datasets is more difficult than extracting the corresponding patterns from traditional numeric and categorical data because of the complexity of spatial data types, spatial relationships, and spatial autocorrelation. A popular approach is to apply classical data mining techniques after transforming spatial components into non-spatial components via feature selection. An alternative is to explore new models, new objective functions, and new patterns that are better suited to spatial data and their unique properties. This chapter investigates techniques in the literature that incorporate spatial components via feature selection, new models, new objective functions, and new patterns.
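As a rough illustration of what spatial autocorrelation means in practice, the sketch below computes Moran's I, a standard autocorrelation statistic, on a small grid. It is not taken from the chapter: the 4x4 grid, the rook (shared-edge) neighborhood with binary weights, and the helper names `rook_neighbors` and `morans_i` are all assumptions made for the example.

```python
def rook_neighbors(rows, cols):
    """Map each cell index to the indices of cells that share an edge with it.

    Assumption: a regular grid with binary rook adjacency (hypothetical setup).
    """
    neighbors = {}
    for r in range(rows):
        for c in range(cols):
            adjacent = []
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    adjacent.append(nr * cols + nc)
            neighbors[r * cols + c] = adjacent
    return neighbors


def morans_i(values, neighbors):
    """Moran's I with binary weights: (n / W) * sum_ij w_ij d_i d_j / sum_i d_i^2."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    # Cross-products of deviations over every ordered pair of neighboring cells.
    cross = sum(dev[i] * dev[j] for i, adj in neighbors.items() for j in adj)
    total_weight = sum(len(adj) for adj in neighbors.values())
    variance_sum = sum(d * d for d in dev)
    return (n / total_weight) * (cross / variance_sum)


ROWS, COLS = 4, 4
grid_neighbors = rook_neighbors(ROWS, COLS)

# Clustered pattern: left two columns are 0, right two columns are 1.
clustered = [1.0 if c >= 2 else 0.0 for r in range(ROWS) for c in range(COLS)]
# Checkerboard pattern: every cell differs from all of its rook neighbors.
checkerboard = [float((r + c) % 2) for r in range(ROWS) for c in range(COLS)]

clustered_i = morans_i(clustered, grid_neighbors)
checkerboard_i = morans_i(checkerboard, grid_neighbors)

print(f"clustered:    {clustered_i:.3f}")     # positive: similar values sit together
print(f"checkerboard: {checkerboard_i:.3f}")  # negative: neighbors are dissimilar
```

Both grids contain the same multiset of values, so a method that ignores location cannot tell them apart; only the neighborhood structure separates them. This is one way to see why classical techniques, which typically assume independent samples, need either spatially derived features or spatially aware models.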
