Visual Analytics and Human Involvement in Machine Learning

Despite their rapid development, AI systems and applications still require human involvement in practically all parts of the analytics process. Human decisions are largely based on visualizations, which give data scientists details of data properties and of the results of analytical procedures. Different visualizations are used in different steps of the machine learning (ML) process. Which visualization to use depends on factors such as the data domain, the data model, and the step in the ML process. In this chapter, we describe the seven steps of the ML process and review visualization techniques relevant to each step for different types of data, models, and purposes.
