Do Convolutional Neural Networks Learn Class Hierarchy?

Convolutional Neural Networks (CNNs) currently achieve state-of-the-art accuracy in image classification. With a growing number of classes, the accuracy usually drops as the possibilities of confusion increase. Interestingly, the class confusion patterns follow a hierarchical structure over the classes. We present visual-analytics methods to reveal and analyze this hierarchy of similar classes in relation with CNN-internal data. We found that this hierarchy not only dictates the confusion patterns between the classes, it furthermore dictates the learning behavior of CNNs. In particular, the early layers in these networks develop feature detectors that can separate high-level groups of classes quite well, even after a few training epochs. In contrast, the latter layers require substantially more epochs to develop specialized feature detectors that can separate individual classes. We demonstrate how these insights are key to significant improvement in accuracy by designing hierarchy-aware CNNs that accelerate model convergence and alleviate overfitting. We further demonstrate how our methods help in identifying various quality issues in the training data.

[1]  Angelo Arleo,et al.  Optimal context separation of spiking haptic signals by second-order somatosensory neurons , 2009, NIPS.

[2]  Geoffrey E. Hinton,et al.  Distilling the Knowledge in a Neural Network , 2015, ArXiv.

[3]  Rob Fergus,et al.  Visualizing and Understanding Convolutional Networks , 2013, ECCV.

[4]  Nassir Navab,et al.  A Taxonomy and Library for Visualizing Learned Features in Convolutional Neural Networks , 2016, ArXiv.

[5]  Carlos Guestrin,et al.  "Why Should I Trust You?": Explaining the Predictions of Any Classifier , 2016, ArXiv.

[6]  Ashish Kapoor,et al.  FeatureInsight: Visual support for error-driven feature ideation in text classification , 2015, 2015 IEEE Conference on Visual Analytics Science and Technology (VAST).

[7]  Jean-Daniel Fekete,et al.  Matrix Reordering Methods for Table and Network Visualization , 2016, Comput. Graph. Forum.

[8]  Silvio Savarese,et al.  Deep Metric Learning via Lifted Structured Feature Embedding , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[9]  Guigang Zhang,et al.  Deep Learning , 2016, Int. J. Semantic Comput..

[10]  Pascal Vincent,et al.  The Difficulty of Training Deep Architectures and the Effect of Unsupervised Pre-Training , 2009, AISTATS.

[11]  Jean-Daniel Fekete,et al.  ZAME: Interactive Large-Scale Graph Visualization , 2008, 2008 IEEE Pacific Visualization Symposium.

[12]  Nitish Srivastava,et al.  Dropout: a simple way to prevent neural networks from overfitting , 2014, J. Mach. Learn. Res..

[13]  George A. Miller,et al.  WordNet: A Lexical Database for English , 1995, HLT.

[14]  Thomas Brox,et al.  Synthesizing the preferred inputs for neurons in neural networks via deep generator networks , 2016, NIPS.

[15]  Adam W. Harley An Interactive Node-Link Visualization of Convolutional Neural Networks , 2015, ISVC.

[16]  Fei-Fei Li,et al.  What Does Classifying More Than 10, 000 Image Categories Tell Us? , 2010, ECCV.

[17]  John T. Stasko,et al.  Value-driven evaluation of visualizations , 2014, BELIV.

[18]  Andrew Zisserman,et al.  Deep Features for Text Spotting , 2014, ECCV.

[19]  Li Fei-Fei,et al.  ImageNet: A large-scale hierarchical image database , 2009, CVPR.

[20]  Peter Kontschieder,et al.  Decision Forests, Convolutional Networks and the Models in-Between , 2016, ArXiv.

[21]  Daniel Bruckner ML-o-Scope: A Diagnostic Visualization System for Deep Machine Learning Pipelines , 2014 .

[22]  Desney S. Tan,et al.  EnsembleMatrix: interactive visualization to support machine learning with multiple classifiers , 2009, CHI.

[23]  Jun Wang,et al.  Classification Visualization with Shaded Similarity Matrix , 2002 .

[24]  Abhineet Saxena,et al.  Convolutional neural networks: an illustration in TensorFlow , 2016, XRDS.

[25]  Zhen Li,et al.  Towards Better Analysis of Deep Convolutional Neural Networks , 2016, IEEE Transactions on Visualization and Computer Graphics.

[26]  Thomas Brox,et al.  Striving for Simplicity: The All Convolutional Net , 2014, ICLR.

[27]  Klaus-Robert Müller,et al.  Efficient BackProp , 2012, Neural Networks: Tricks of the Trade.

[28]  Lynda Hardman,et al.  Visualization of Confusion Matrix for Non-Expert Users (Poster) , 2014 .

[29]  Bolei Zhou,et al.  Object Detectors Emerge in Deep Scene CNNs , 2014, ICLR.

[30]  Pietro Perona,et al.  Learning and using taxonomies for fast visual categorization , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[31]  Bolei Zhou,et al.  Understanding Intra-Class Knowledge Inside CNN , 2015, ArXiv.

[32]  Robinson Piramuthu,et al.  HD-CNN: Hierarchical Deep Convolutional Neural Networks for Large Scale Visual Recognition , 2014, 2015 IEEE International Conference on Computer Vision (ICCV).

[33]  Paulo E. Rauber,et al.  Visualizing the Hidden Activity of Artificial Neural Networks , 2017, IEEE Transactions on Visualization and Computer Graphics.

[34]  Stephen Guattery,et al.  On the Quality of Spectral Separators , 1998, SIAM J. Matrix Anal. Appl..

[35]  Jason Yosinski,et al.  Multifaceted Feature Visualization: Uncovering the Different Types of Features Learned By Each Neuron in Deep Neural Networks , 2016, ArXiv.

[36]  Kwan-Liu Ma,et al.  Opening the black box - data driven visualization of neural networks , 2005, VIS 05. IEEE Visualization, 2005..

[37]  Yoshua Bengio,et al.  Why Does Unsupervised Pre-training Help Deep Learning? , 2010, AISTATS.

[38]  P. Cochat,et al.  Et al , 2008, Archives de pediatrie : organe officiel de la Societe francaise de pediatrie.

[39]  Martin Wattenberg,et al.  Direct-Manipulation Visualization of Deep Networks , 2017, ArXiv.

[40]  Nikolaos Papanikolopoulos,et al.  Scalable Active Learning for Multiclass Image Classification , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[41]  Andrea Vedaldi,et al.  MatConvNet: Convolutional Neural Networks for MATLAB , 2014, ACM Multimedia.

[42]  Hod Lipson,et al.  Understanding Neural Networks Through Deep Visualization , 2015, ArXiv.

[43]  Andrew Zisserman,et al.  Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps , 2013, ICLR.

[44]  Trevor Darrell,et al.  DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition , 2013, ICML.

[45]  Jarke J. van Wijk,et al.  BaobabView: Interactive construction and analysis of decision trees , 2011, 2011 IEEE Conference on Visual Analytics Science and Technology (VAST).

[46]  Desney S. Tan,et al.  Interactive optimization for steering machine classification , 2010, CHI.

[47]  J. B. Kruskal,et al.  Icicle Plots: Better Displays for Hierarchical Clustering , 1983 .

[48]  Daniel A. Keim,et al.  Visual Boosting in Pixel‐based Visualizations , 2011, Comput. Graph. Forum.

[49]  Silvia Miksch,et al.  Visual Methods for Analyzing Probabilistic Classification Data , 2014, IEEE Transactions on Visualization and Computer Graphics.

[50]  James A. Landay,et al.  Gestalt: integrated support for implementation and analysis in machine learning , 2010, UIST.

[51]  David G. Lowe,et al.  Object recognition from local scale-invariant features , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[52]  Dorin Comaniciu,et al.  Deep Decision Network for Multi-class Image Classification , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[53]  Andrea Vedaldi,et al.  Visualizing Deep Convolutional Neural Networks Using Natural Pre-images , 2015, International Journal of Computer Vision.

[54]  Bongshin Lee,et al.  Squares: Supporting Interactive Performance Analysis for Multiclass Classifiers , 2017, IEEE Transactions on Visualization and Computer Graphics.

[55]  Alexander M. Rush,et al.  Visual Analysis of Hidden State Dynamics in Recurrent Neural Networks , 2016, ArXiv.

[56]  Yu-Ru Lin,et al.  UnTangle Map: Visual Analysis of Probabilistic Multi-Label Data , 2016, IEEE Transactions on Visualization and Computer Graphics.

[57]  Michael S. Bernstein,et al.  ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.

[58]  Thomas Brox,et al.  Inverting Visual Representations with Convolutional Networks , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[59]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[60]  Yoshua Bengio,et al.  Gradient-based learning applied to document recognition , 1998, Proc. IEEE.

[61]  Trevor Darrell,et al.  Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.

[62]  Jacques Bertin,et al.  Semiology of Graphics - Diagrams, Networks, Maps , 2010 .

[63]  Lawrence Hubert,et al.  SOME APPLICATIONS OF GRAPH THEORY AND RELATED NON‐METRIC TECHNIQUES TO PROBLEMS OF APPROXIMATE SERIATION: THE CASE OF SYMMETRIC PROXIMITY MEASURES , 1974 .

[64]  Andrea Vedaldi,et al.  Understanding deep image representations by inverting them , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[65]  Terrance E. Boult,et al.  Towards Open Set Deep Networks , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[66]  Luke Yeager,et al.  Effective Visualizations for Training and Evaluating Deep Models , 2016 .

[67]  Erkki Mäkinen,et al.  The Barycenter Heuristic and the Reorderable Matrix , 2005, Informatica.

[68]  Mathieu Aubry,et al.  Understanding Deep Features with Computer-Generated Imagery , 2015, ICCV.

[69]  Sergio Escalera,et al.  ChaLearn Looking at People Challenge 2014: Dataset and Results , 2014, ECCV Workshops.

[70]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[71]  Venkatesh Saligrama,et al.  Efficient Training of Very Deep Neural Networks for Supervised Hashing , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[72]  David Maxwell Chickering,et al.  ModelTracker: Redesigning Performance Analysis Tools for Machine Learning , 2015, CHI.

[73]  Xiting Wang,et al.  Towards better analysis of machine learning models: A visual analytics perspective , 2017, Vis. Informatics.

[74]  Khe Chai Sim,et al.  Towards implicit complexity control using variable-depth deep neural networks for automatic speech recognition , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[75]  Dumitru Erhan,et al.  Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[76]  Jason Yosinski,et al.  Deep neural networks are easily fooled: High confidence predictions for unrecognizable images , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[77]  Alexander Binder,et al.  On Pixel-Wise Explanations for Non-Linear Classifier Decisions by Layer-Wise Relevance Propagation , 2015, PloS one.