A Probabilistic Grammar of Graphics

Visualizations depicting probabilities and uncertainty are used everywhere from medical risk communication to machine learning, yet these probabilistic visualizations are difficult to specify, prone to error, and cumbersome to explore as designs. We propose a Probabilistic Grammar of Graphics (PGoG), an extension to Wilkinson's original Grammar of Graphics. Inspired by the success of probabilistic programming languages, PGoG makes probability expressions, such as P(A|B), first-class citizens in the language. PGoG abstractions also reflect the distinction between probability and frequency framing, a concept from the uncertainty communication literature. PGoG is expressive, encompassing product plots, density plots, icon arrays, and dotplots, among other visualizations. Its coherent syntax ensures correctness (that the proportions of visual elements and their spatial placement reflect the underlying probability distribution) and reduces the edit distance between probabilistic visualization specifications, potentially supporting more design exploration. We provide a proof-of-concept implementation of PGoG in R.
