Net2Vis – A Visual Grammar for Automatically Generating Publication-Tailored CNN Architecture Visualizations

Appropriate visualizations are essential for conveying neural network architectures in publications. While most current deep learning papers contain such visualizations, they are usually handcrafted just before publication, which leads to the lack of a common visual grammar, a significant time investment, errors, and ambiguities. Current automatic network visualization tools focus on debugging the network itself and are not well suited to generating publication-ready visualizations. We therefore present an approach that automates this process by translating network architectures specified in Keras into visualizations that can be embedded directly into any publication. To do so, we propose a visual grammar for convolutional neural networks (CNNs), derived from an analysis of such figures extracted from all ICCV and CVPR papers published between 2013 and 2019. The proposed grammar covers visual encoding, network layout, layer aggregation, and legend generation. We have further realized our approach in an online system available to the community, which we have evaluated through expert feedback and a quantitative study. It not only reduces the time needed to generate publication-ready network visualizations, but also enables a unified and unambiguous visualization design.
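To make two of the grammar components named above more concrete, the following is a minimal sketch of layer aggregation and legend generation over a flat list of layer type names. It is an illustration only and not the Net2Vis implementation: the function names `aggregate` and `legend`, the block name `ConvBlock`, the example layer sequence, and the colour palette are all assumptions introduced here.

```python
from typing import Dict, List, Sequence, Tuple


def aggregate(layers: Sequence[str], pattern: Tuple[str, ...], name: str) -> List[str]:
    """Replace every non-overlapping occurrence of `pattern` with one glyph `name`."""
    out: List[str] = []
    n = len(pattern)
    i = 0
    while i < len(layers):
        if n > 0 and tuple(layers[i:i + n]) == pattern:
            out.append(name)  # the whole repeated block becomes a single glyph
            i += n
        else:
            out.append(layers[i])
            i += 1
    return out


def legend(glyphs: Sequence[str], palette: Sequence[str]) -> Dict[str, str]:
    """Assign one colour per distinct glyph type, in order of first appearance."""
    distinct = list(dict.fromkeys(glyphs))  # de-duplicate while keeping order
    if len(distinct) > len(palette):
        raise ValueError("palette has fewer colours than distinct glyph types")
    return dict(zip(distinct, palette))


# Hypothetical layer sequence, written with Keras-style layer type names.
layers = [
    "Conv2D", "Conv2D", "MaxPooling2D",
    "Conv2D", "Conv2D", "MaxPooling2D",
    "Flatten", "Dense",
]
block = ("Conv2D", "Conv2D", "MaxPooling2D")

glyphs = aggregate(layers, block, "ConvBlock")
key = legend(glyphs, ["#1f77b4", "#ff7f0e", "#2ca02c"])

print(glyphs)
print(key)
```

In this sketch the eight layers collapse to four glyphs, and the legend needs only three entries, which is the kind of reduction that aggregation is meant to provide for figures with limited space.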
