Spatial outlier detection based on iterative self-organizing learning model

In this paper, we propose an iterative self-organizing map (SOM) approach with robust distance estimation (ISOMRD) for spatial outlier detection. Spatial outliers are data instances whose non-spatial attribute values differ significantly from those of their spatial neighbors. Our approach uses a SOM to preserve the intrinsic topological and metric relationships of the data distribution and thereby find reasonable spatial clusters for outlier detection. The iterative learning process with robust distance estimation addresses the high-dimensional problems of spatial attributes and accurately detects spatial outliers with irregular features. To verify the efficiency and robustness of the proposed algorithm, a comparative study of ISOMRD and several existing approaches is presented in detail. Specifically, we test the performance of our method on four real-world spatial datasets. The simulation results demonstrate the effectiveness of the proposed approach.
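
To make the definition of a spatial outlier concrete, the following is a minimal sketch and not the ISOMRD algorithm itself. It assumes plain k-nearest spatial neighbors in place of SOM-derived clusters, and a per-attribute median/MAD scaling in place of the paper's robust distance estimator. The function name `spatial_outlier_scores` and the parameter `k` are hypothetical and chosen only for illustration.

```python
import numpy as np


def spatial_outlier_scores(coords, attrs, k=4):
    """Score points by how much their non-spatial attributes deviate
    from the mean attributes of their k nearest spatial neighbors.

    Assumptions (not from the paper): k-NN neighborhoods instead of
    SOM clusters, and median/MAD scaling as a simple robust distance.
    """
    coords = np.asarray(coords, dtype=float)  # shape (n, d_spatial)
    attrs = np.asarray(attrs, dtype=float)    # shape (n, q)

    # Pairwise spatial distances; a point is never its own neighbor.
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)
    neighbors = np.argsort(dist, axis=1)[:, :k]

    # Attribute difference between each point and its neighborhood mean.
    diff = attrs - attrs[neighbors].mean(axis=1)

    # Robust center and scale per attribute (median and scaled MAD).
    center = np.median(diff, axis=0)
    mad = 1.4826 * np.median(np.abs(diff - center), axis=0)
    mad = np.where(mad > 0, mad, 1.0)  # guard against zero scale

    # Robust distance: Euclidean norm of the robustly standardized differences.
    z = (diff - center) / mad
    return np.sqrt((z ** 2).sum(axis=1))
```

Points with the largest scores are the candidates for spatial outliers. A full iterative scheme would typically remove or correct the top-scoring point and recompute the scores, so that a single extreme value does not mask or distort its neighbors.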
