Data Mining Techniques for Autonomous Exploration of Large Volumes of Geo-referenced Crime Data

We incorporate two knowledge discovery techniques, clustering and association-rule mining, into an exploratory tool for discovering spatio-temporal patterns. The tool acts as an autonomous pattern detector that reveals plausible cause-effect associations between layers of point and area data. We present two methods for this exploratory analysis and detail algorithms that explore geo-referenced data effectively. We illustrate the algorithms with real crime data, demonstrating a new type of analysis of the spatio-temporal dimensions of records of criminal events. We hope this will lead to new approaches to exploring large volumes of spatio-temporal data.
