Consistent thinning of large geographical data for map visualization

Large-scale map visualization systems play an increasingly important role in presenting geographic datasets to end users. Because these datasets can be extremely large, a map rendering system often must select a small fraction of the data to display in a limited space. This article addresses the fundamental challenge of thinning: determining appropriate samples of data to show for specific geographical regions and zoom levels. Beyond the sheer scale of the data, thinning is challenging for three reasons: (1) the data can consist of complex geographical shapes; (2) rendering must satisfy certain constraints, such as data being preserved across zoom levels and adjacent regions; and (3) among the solutions that satisfy the constraints, an optimal one must be chosen based on objectives such as maximality, fairness, and importance of data. This article formally defines the thinning problem and presents a complete solution to it. First, we express the problem as an integer programming formulation that efficiently solves thinning for desired objectives. Second, we present more efficient solutions for maximality, based on DFS traversal of a spatial tree. Third, we consider the common special case of point datasets and present an even more efficient randomized algorithm. Fourth, we show that contiguous regions are tractable for a general version of maximality for which arbitrary regions are intractable. Fifth, we examine the structure of our integer programming formulation and show that, for point datasets, our program is integral. Finally, we have implemented all techniques from this article in Google Maps [Google 2005] visualizations of Fusion Tables [Gonzalez et al. 2010], and we describe a set of experiments that demonstrate the trade-offs among the algorithms.
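To make the point-dataset case concrete, the following is a minimal sketch of one simple randomized thinning scheme, written under assumptions that are not taken from the article: points lie in the unit square, zoom level z divides the square into a 2^z by 2^z grid, and each cell may show at most k points. Each point receives a random priority once, and every cell shows its k lowest-priority members. The names `cell_of`, `thin_points`, `max_zoom`, and `k` are hypothetical, and this should not be read as the article's exact algorithm.

```python
import random
from collections import defaultdict


def cell_of(x, y, zoom):
    """Return the grid cell containing (x, y) at the given zoom level.

    Assumes coordinates in [0, 1]; zoom z has 2**z cells per side.
    """
    n = 2 ** zoom
    # Clamp so that a coordinate of exactly 1.0 falls in the last cell.
    return (min(int(x * n), n - 1), min(int(y * n), n - 1))


def thin_points(points, max_zoom, k, seed=0):
    """Pick at most k visible points per cell at every zoom level.

    points: dict mapping point id -> (x, y)
    Returns a dict mapping zoom level -> set of visible point ids.
    """
    rng = random.Random(seed)
    # One random priority per point, shared by all zoom levels.
    priority = {pid: rng.random() for pid in points}

    visible = {}
    for zoom in range(max_zoom + 1):
        cells = defaultdict(list)
        for pid, (x, y) in points.items():
            cells[cell_of(x, y, zoom)].append(pid)

        shown = set()
        for members in cells.values():
            # Keep the k members with the smallest priority.
            members.sort(key=priority.__getitem__)
            shown.update(members[:k])
        visible[zoom] = shown
    return visible


if __name__ == "__main__":
    rng = random.Random(42)
    pts = {i: (rng.random(), rng.random()) for i in range(1000)}
    result = thin_points(pts, max_zoom=3, k=5)
    for zoom, shown in result.items():
        print(zoom, len(shown))
```

Because priorities are fixed across zoom levels and each finer cell contains a subset of its parent cell's points, a point that ranks among the k best in a cell also ranks among the k best in the child cell that contains it. A point visible at one zoom level therefore stays visible at every finer level, which is the zoom-consistency constraint in this simplified setting.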

[1] G. Powell, et al. Terrestrial Ecoregions of the World: A New Map of Life on Earth, 2001.

[2] K. S. Shea, et al. Cartographic generalization in a digital environment: when and how to generalize, 1989.

[3] David S. Johnson, et al. Computers and Intractability: A Guide to the Theory of NP-Completeness, 1978.

[4] Cong Yu, et al. Computational Journalism: A Call to Arms to Database Researchers, 2011, CIDR.

[5] Michael Stonebraker, et al. Constant information density in zoomable interfaces, 1998, AVI '98.

[6] Gerhard Weikum, et al. ACM Transactions on Database Systems, 2005.

[7] D. Hilbert. Ueber die stetige Abbildung einer Linie auf ein Flächenstück, 1891.

[8] Hanan Samet, et al. Incremental distance join algorithms for spatial databases, 1998, SIGMOD '98.

[9] Giuliana Dettori, et al. Towards a Formal Model for Multi-Resolution Spatial Maps, 1995, SSD.

[10] Sriram Subramanian, et al. Talking about tactile experiences, 2013, CHI.

[11] Andrew U. Frank, et al. Multiple representations for cartographic objects in a multi-scale tree - An intelligent graphical zoom, 1994, Comput. Graph.

[12] Chee-Keng Yap, et al. Dynamic Map Labeling, 2006, IEEE Transactions on Visualization and Computer Graphics.

[13] Pat Hanrahan, et al. Maintaining interactivity while exploring massive time series, 2008, 2008 IEEE Symposium on Visual Analytics Science and Technology.

[14] I. J. Schoenberg, et al. The Relaxation Method for Linear Inequalities, 1954, Canadian Journal of Mathematics.

[15] David J. DeWitt, et al. Building a scaleable geo-spatial DBMS: technology, implementation, and evaluation, 1997, SIGMOD '97.

[16] Alan J. Dix, et al. Statistical, 2018, The War of Words.

[17] Lutz Plümer, et al. Fast Screen Map Labeling – Data-Structures and Algorithms, 2003.

[18] H. Sagan. Space-filling curves, 1994.

[19] Anthony K. H. Tung, et al. Spatial clustering methods in data mining: A survey, 2001.

[20] Christopher B. Jones, et al. Automated map generalization with multiple operators: a simulated annealing approach, 2003, Int. J. Geogr. Inf. Sci.

[21] Hanan Samet, et al. The Design and Analysis of Spatial Data Structures, 1989.

[22] Petra Perner, et al. Data Mining - Concepts and Techniques, 2002, Künstliche Intell.

[23] Nectaria Tryfona, et al. Consistency among parts and aggregates: A computational model, 1996, Trans. GIS.

[24] D. Hilbert. Über die stetige Abbildung einer Linie auf ein Flächenstück, 1935.

[25] R. Phillips, et al. An Investigation of Visual Clutter in the Topographic Base of a Geological Map, 1982.

[26] Monica M. C. Schraefel, et al. Trust me, I'm partially right: incremental visualization lets analysts explore large datasets faster, 2012, CHI.

[27] Stéphane Grumbach, et al. The DEDALE system for complex spatial queries, 1998, SIGMOD '98.

[28] William G. Cochran, et al. Sampling Techniques, 3rd Edition, 1963.

[29] Jayant Madhavan, et al. Efficient spatial sampling of large geographical tables, 2012, SIGMOD Conference.

[30] Kristin A. Cook, et al. Illuminating the Path: The Research and Development Agenda for Visual Analytics, 2005.

[31] David K. Smith. Theory of Linear and Integer Programming, 1987.

[32] Alon Halevy, et al. Consistent thinning of large geographical data for map visualization, 2013.

[33] Pat Hanrahan, et al. Multiscale Visualization Using Data Cubes, 2003, IEEE Trans. Vis. Comput. Graph.

[34] Alexander Wolff, et al. Optimizing active ranges for consistent dynamic map labeling, 2010, Comput. Geom.

[35] Marios Hadjieleftheriou, et al. R-Trees - A Dynamic Index Structure for Spatial Searching, 2008, ACM SIGSPATIAL International Workshop on Advances in Geographic Information Systems.

[36] Christian S. Jensen, et al. Google fusion tables: web-centered data management and collaboration, 2010, SIGMOD Conference.

[37] Wolfgang Berger, et al. A Multi-Threading Architecture to Support Interactive Visual Exploration, 2009, IEEE Transactions on Visualization and Computer Graphics.

[38] Building a scalable geo-spatial dbms: Technology, implementation, and evaluation, 1997.

[39] Ihab F. Ilyas, et al. A survey of top-k query processing techniques in relational database systems, 2008, CSUR.