A new approach to the nearest‐neighbour method to discover cluster features in overlaid spatial point processes

When two spatial point processes are overlaid, the points of the higher‐rate process appear as clusters, while those of the lower‐rate process tend to be perceived as background. The clustered points are usually treated as features and the background as noise. Revealing these clusters makes it possible to examine and understand the underlying spatial point process. Discerning cluster features in a set of points involves two tasks: removing noise and determining the number of clusters. Until now, few methods have handled both tasks together in an automated way. In this study, we combine the nearest‐neighbour (NN) method with the concept of density‐connectivity to address both. First, noise is removed using the NN method; then, the number of clusters is determined by finding the density‐connected clusters. Our algorithm also reduces the complexity of finding density‐connected clusters. Because the number of clusters depends on the value of k (the order of the kth nearest neighbour), we introduce the concept of lifetime for the number of clusters to measure how stable the segmentation result (that is, the number of clusters) is across values of k. The number of clusters with the longest lifetime is taken as the final number of clusters. Finally, a seismic data set from western China is used as a case study to examine the validity of the method. In this case study, we discovered three seismic clusters: one consisting of foreshocks of the Songpan quake (M = 7.2), and two consisting of aftershocks of the Kangding‐Jiulong quake (M = 6.2) and the Daguan quake (M = 7.1), respectively. From this case study, we conclude that the proposed approach is effective in removing noise and determining the number of feature clusters.
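The three steps named in the abstract (NN noise removal, density‐connected clustering, and lifetime of the cluster count over k) can be illustrated with a minimal sketch. It is not the authors' implementation, and several parts are assumptions made for illustration: the function names are hypothetical; the feature/noise split uses a crude one‐dimensional two‐means threshold on the kth NN distances as a stand‐in for the NN method; the same threshold is reused as the linking radius `eps`; and lifetime is read as the number of consecutive k values that yield the same cluster count.

```python
import math
from collections import deque


def dist(p, q):
    """Euclidean distance between two 2-D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def kth_nn_distances(points, k):
    """Distance from each point to its kth nearest neighbour (brute force)."""
    out = []
    for i, p in enumerate(points):
        d = sorted(dist(p, q) for j, q in enumerate(points) if j != i)
        out.append(d[k - 1])
    return out


def two_means_cut(dists, max_iter=100):
    """Assumed stand-in for the NN classification step: split the kth NN
    distances into a 'short' and a 'long' group with 1-D two-means and
    return the midpoint between the two group means."""
    lo, hi = min(dists), max(dists)
    for _ in range(max_iter):
        cut = (lo + hi) / 2.0
        low = [d for d in dists if d <= cut]
        high = [d for d in dists if d > cut]
        if not high:  # all distances identical: nothing to split
            break
        new_lo, new_hi = sum(low) / len(low), sum(high) / len(high)
        if (new_lo, new_hi) == (lo, hi):  # group means stopped moving
            break
        lo, hi = new_lo, new_hi
    return (lo + hi) / 2.0


def count_clusters(points, k):
    """Drop noise, then count connected groups of the retained points."""
    dists = kth_nn_distances(points, k)
    eps = two_means_cut(dists)  # assumption: cut doubles as linking radius
    feature = [p for p, d in zip(points, dists) if d <= eps]
    # Every retained point has its kth neighbour within eps, so linking
    # retained points closer than eps gives the clusters as connected
    # components (breadth-first search below).
    unseen = set(range(len(feature)))
    n_clusters = 0
    while unseen:
        n_clusters += 1
        queue = deque([unseen.pop()])
        while queue:
            i = queue.popleft()
            linked = [j for j in unseen if dist(feature[i], feature[j]) <= eps]
            unseen.difference_update(linked)
            queue.extend(linked)
    return n_clusters


def longest_lifetime(counts):
    """Return (cluster count, run length) for the longest run of equal
    consecutive entries; counts[i] is the count for the ith value of k."""
    best_count, best_run = counts[0], 0
    run_count, run = counts[0], 0
    for c in counts:
        if c == run_count:
            run += 1
        else:
            run_count, run = c, 1
        if run > best_run:
            best_count, best_run = run_count, run
    return best_count, best_run


def demo_points():
    """Synthetic data: two dense 5x5 grids plus four isolated points."""
    grid_a = [(x, y) for x in range(5) for y in range(5)]
    grid_b = [(x + 20, y) for x in range(5) for y in range(5)]
    noise = [(12, 22), (12, -18), (-18, 2), (42, 2)]
    return grid_a + grid_b + noise


if __name__ == "__main__":
    counts = [count_clusters(demo_points(), k) for k in range(1, 7)]
    print(counts, longest_lifetime(counts))
```

On the synthetic data, the cluster count is evaluated for k = 1 to 6 and the count with the longest run is reported. The brute‐force neighbour search is quadratic in the number of points; a spatial index would be the natural replacement for larger data sets.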
