Pixel based visual data mining of geo-spatial data

Abstract In many application domains, data is collected and referenced by its geo-spatial location. Spatial data mining, the discovery of interesting patterns in such data, is an important capability in the development of database systems. A noteworthy trend is the increasing size of data sets in common use, such as records of business transactions, environmental data, and census demographics. These data sets often contain millions of records or more, which creates new challenges in coping with scale. For data mining of large data sets to be effective, it is also important to include humans in the data exploration process and to combine their flexibility, creativity, and general knowledge with the enormous storage capacity and computational power of today's computers. Visual data mining applies human visual perception to the exploration of large data sets. Presenting data in an interactive, graphical form often fosters new insights and encourages the formation and validation of new hypotheses, leading to better problem solving and deeper domain knowledge. In this paper we give a short overview of visual data mining techniques, especially for analyzing geo-spatial data. We provide examples of effective visualizations of geo-spatial data in important application areas such as consumer analysis and census demographics.
