Validation of Visual Statistical Inference, Applied to Linear Models

Statistical graphics play a crucial role in exploratory data analysis, model checking, and diagnosis. The lineup protocol enables statistical significance testing of visual findings, bridging the gulf between exploratory and inferential statistics. In this article, inferential methods for statistical graphics are developed further by refining the terminology of visual inference and by framing the lineup protocol in a context that allows direct comparison with conventional tests in scenarios where a conventional test exists. This framework is used to compare the performance of the lineup protocol against conventional statistical testing when fitting linear models. A human subjects experiment is conducted using simulated data to provide controlled conditions. Results suggest that the lineup protocol performs comparably with the conventional tests and, as expected, outperforms them when data are contaminated, a scenario in which the assumptions required for a conventional test are violated. Surprisingly, visual tests have higher power than the conventional tests when the effect size is large. Interestingly, there may also be some super-visual individuals who achieve better performance and power than the conventional test even in the most difficult tasks. Supplementary materials for this article are available online.
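To make the idea of significance testing with a lineup concrete, the following is a minimal sketch of how a visual p-value could be computed. It rests on assumptions not stated in the abstract: each lineup shows `n_plots` panels (20 is used here only as an illustrative default), evaluators act independently, and under the null hypothesis each evaluator picks the data plot with probability `1 / n_plots`. The function name `lineup_p_value` is hypothetical, and the article's own calculation may differ.

```python
from math import comb


def lineup_p_value(n_correct: int, n_evaluators: int, n_plots: int = 20) -> float:
    """Upper-tail binomial probability of at least n_correct correct picks.

    Assumes (for illustration only) independent evaluators who, under the
    null hypothesis, each choose the data plot with probability 1 / n_plots.
    """
    if not 0 <= n_correct <= n_evaluators:
        raise ValueError("n_correct must lie between 0 and n_evaluators")
    if n_plots < 2:
        raise ValueError("a lineup needs at least two plots")

    p = 1.0 / n_plots  # chance of picking the data plot by guessing
    # P(X >= n_correct) for X ~ Binomial(n_evaluators, p)
    return sum(
        comb(n_evaluators, k) * p**k * (1.0 - p) ** (n_evaluators - k)
        for k in range(n_correct, n_evaluators + 1)
    )


if __name__ == "__main__":
    # One evaluator, one correct pick among 20 plots.
    print(lineup_p_value(1, 1))
    # Five evaluators, three of whom pick the data plot.
    print(lineup_p_value(3, 5))
```

Under these assumptions, a single evaluator who identifies the data plot among 20 panels yields a p-value of 1/20, and additional evaluators who agree drive the value down quickly.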
