Impacts of the Numbers of Colors and Shapes on Outlier Detection: from Automated to User Evaluation

The design of efficient representations is well established as a fruitful way to explore and analyze complex or large data. In these representations, data are encoded with various visual attributes depending on the needs of the representation itself. To make coherent design choices about visual attributes, the visual search field proposes guidelines based on how the human brain perceives features. However, information visualization representations frequently need to depict more data than the amount these guidelines have been validated on. Since then, the information visualization community has extended these guidelines to a wider parameter space. This paper contributes to this theme by extending visual search theories to an information visualization context. We consider a visual search task where subjects are asked to find an unknown outlier in a grid of randomly laid out distractors. Stimuli are defined by color and shape features for the purpose of visually encoding categorical data. The experimental protocol consists of a parameter space reduction step (i.e., sub-sampling) based on a machine learning model, and a user evaluation to measure capacity limits and validate hypotheses. The results show that the major difficulty factor is the number of visual attributes used to encode the outlier. When the outlier is redundantly encoded, the display heterogeneity has no effect on the task. When it is encoded with one attribute, the difficulty depends on that attribute's heterogeneity until its capacity limit (7 for color, 5 for shape) is reached. Finally, when it is encoded with two attributes simultaneously, performance drops drastically even with minor heterogeneity.
