By chance: enhancing interaction with large data sets through statistical sampling

Randomised algorithms have enabled the solution of otherwise intractable problems in many areas of computer science. In this paper we propose that random sampling can make the visualisation of large datasets both more computationally efficient and more perceptually effective. We review the explicit uses of randomness, and the related deterministic techniques, in the visualisation literature. We then discuss how sampling can augment existing systems. Furthermore, we demonstrate a novel 2D zooming interface, the Astral Telescope Visualiser, a visualisation both suggested and enabled by sampling. We conclude by considering some general usability and technical issues raised by sampling-based visualisation.
