Data lineage and information density in database visualization

Visual representations of data help users interpret and analyze information. We have identified two key issues in existing visualization systems: data lineage and information density. This dissertation defines these problems and details solutions for them. We show that our techniques can be applied in database visualization systems, and we discuss how they improve the usability of these systems.

The data lineage problem occurs when users apply a sequence of processing steps to input data sources; when viewing the final result, these users may wish to trace certain elements in the result back to the original input items. We call these types of queries data lineage queries. Current systems, e.g., geographic information systems or scientific visualization systems, provide little support for this task. In the first part of this dissertation, we discuss techniques for allowing users to access intermediate results efficiently while performing data lineage queries. We then introduce weak inversion and verification and show how they can be used to reconstruct the (approximate) lineage of derived data. Because they eliminate much of the irrelevant source data, weak inversion and verification can greatly reduce the amount of source data the end user must examine while performing a data lineage query.

Visualizations often display too much information, making it difficult for users to interpret them. Similarly, visualizations often display too little information, thereby underutilizing display space. In the second part of this dissertation, we describe the general principle of constant information density. We show how both semantic and spatial transformations based on constant information density can be applied to create visualizations with appropriate density, thereby minimizing clutter and sparseness in the display. We describe an end-user programming environment in which users can construct visualizations with constant information density.
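
The weak inversion and verification idea summarized above can be illustrated with a small sketch. It is not taken from the dissertation: the processing step (maximum temperature per grid cell), the sample readings, and the function names `max_per_cell`, `weak_inverse`, `verify`, and `lineage` are all hypothetical, and the weak inverse is assumed here to return a superset of the true lineage that verification then narrows.

```python
# Hypothetical source data: (reading_id, grid_cell, temperature).
READINGS = [
    ("r1", "A", 14.0),
    ("r2", "A", 19.5),
    ("r3", "A", 19.5),
    ("r4", "B", 11.0),
    ("r5", "B", 8.5),
]


def max_per_cell(readings):
    """Processing step: derive the maximum temperature for each grid cell."""
    result = {}
    for _, cell, temp in readings:
        result[cell] = max(temp, result.get(cell, temp))
    return result


def weak_inverse(cell, readings):
    """Weak inversion (assumed form): a cheap, approximate lineage.

    Returns every reading in the cell, a superset of the readings
    that actually determined the derived value.
    """
    return [r for r in readings if r[1] == cell]


def verify(candidates, derived_value):
    """Verification (assumed form): keep only the candidates whose
    value equals the derived value."""
    return [r for r in candidates if r[2] == derived_value]


def lineage(cell, readings):
    """Combine both steps to trace a derived value back to its inputs."""
    derived = max_per_cell(readings)[cell]
    return verify(weak_inverse(cell, readings), derived)


if __name__ == "__main__":
    # Trace the maximum for cell "A" back to the readings that produced it.
    print([r[0] for r in lineage("A", READINGS)])
```

In this toy setting the weak inverse alone already discards the readings from other cells, and verification removes the in-cell readings that did not contribute, which mirrors the claim that the two steps together reduce the source data a user must examine.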
