Visual exploration of rating datasets and user groups

Abstract The increasing availability of rating datasets (i.e., datasets containing user evaluations of items such as products and services) creates new opportunities in applications ranging from behavioral analytics to recommendations. In this paper, we describe the design of VugA, a visual enabler for the exploration of rating data and user groups. VugA helps analysts, whether novices or domain experts, understand their data by seamlessly integrating the exploration of individual users with the exploration of their collective behavior via group analysis. VugA is data-driven and does not require analysts to know the value distributions in their data. While automated systems can identify and suggest potentially interesting groups, they can do so only for well-specified needs (e.g., through SQL queries or constrained mining). VugA helps analysts filter and refine their exploration as they discover what lies in the data. VugA enables analysts to easily obtain statistics about their data, form groups, and find similar and dissimilar groups. While most visual analytics systems are data-dependent, VugA relies on a data model that captures user data in such a way that a variety of group formation and exploration approaches can be used. We describe the architecture of VugA and illustrate its use via tasks and a user study. We conclude with a discussion of future work enabled by VugA.
