SOMSO: A self-organizing map approach for spatial outlier detection with multiple attributes

In this paper, we propose SOMSO, a self-organizing map approach for spatial outlier detection. Spatial outliers are abnormal data points whose non-spatial attribute values differ significantly from those of their spatial neighborhood. Detecting spatial outliers can reveal further information about spatial distribution and attributes that is useful in data mining problems. The self-organizing map (SOM) is an effective method for the visualization and clustering of high-dimensional data, and it can preserve the intrinsic topological and metric relationships in a dataset. The SOMSO method can handle high-dimensional problems involving spatial attributes and accurately detect spatial outliers with irregular features. Experimental results on a dataset based on the U.S. population census indicate that the SOMSO approach can be successfully applied to complex spatial datasets with multiple attributes.
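
To make the definition of a spatial outlier concrete, the following minimal sketch compares each point's non-spatial attribute vector with the mean attribute vector of its spatially nearest neighbors. It is not the SOMSO method itself, whose details are not given here: it uses a plain k-nearest-neighbor neighborhood instead of a self-organizing map, and the function names, the choice of `k`, and the z-score `threshold` are all assumptions made for illustration.

```python
import math

def k_nearest(points, i, k):
    """Indices of the k spatially nearest points to points[i], excluding i."""
    xi, yi = points[i]
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: math.hypot(points[j][0] - xi, points[j][1] - yi))
    return others[:k]

def neighborhood_scores(points, attrs, k=3):
    """Distance between each point's attributes and its neighbors' mean attributes."""
    scores = []
    for i in range(len(points)):
        nbrs = k_nearest(points, i, k)
        dim = len(attrs[i])
        mean = [sum(attrs[j][d] for j in nbrs) / len(nbrs) for d in range(dim)]
        scores.append(math.dist(attrs[i], mean))
    return scores

def flag_outliers(scores, threshold=2.0):
    """Indices whose score lies more than `threshold` standard deviations above the mean."""
    mu = sum(scores) / len(scores)
    sd = math.sqrt(sum((s - mu) ** 2 for s in scores) / len(scores))
    if sd == 0:
        return []  # all scores identical: nothing stands out
    return [i for i, s in enumerate(scores) if (s - mu) / sd > threshold]

# Hypothetical toy data: a 3x3 grid of locations with two non-spatial attributes.
# Only the center location (index 4) has attribute values unlike its neighbors.
points = [(x, y) for x in range(3) for y in range(3)]
attrs = [(1.0, 1.0)] * 9
attrs[4] = (10.0, 10.0)

print(flag_outliers(neighborhood_scores(points, attrs, k=3)))  # -> [4]
```

Note that the neighborhood is built from spatial coordinates only, while the comparison uses non-spatial attributes only. This separation is what distinguishes a spatial outlier from a global outlier in attribute space.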
