Background on Spatial Data Management and Exploration

Before studying mobility data, we take a short tour of the (stationary) spatial domain. Spatial information has been studied thoroughly for decades, from Cartography and Geodesy to Geographical Information Systems (GIS) and Spatial Database Management Systems (SDBMS), a level of attention justified by its importance and ubiquity in everyday life. The database community has followed the extended DBMS paradigm, building spatial functionality into geographical data collections by developing spatial data types, operators and methods for querying, as well as indexing techniques. At the exploration level, multi-dimensional online analytical processing (OLAP) and knowledge discovery in databases (KDD) have produced notable results in the spatial domain. In this chapter, we review spatial database management (modeling, indexing, query processing) and exploration (data warehousing and OLAP analysis, data mining), followed by a short discussion of data privacy. This background familiarizes the reader with the terms and notions used in the corresponding discussion of the mobility data domain in the chapters that follow.
