ETALON IMAGES: UNDERSTANDING THE CONVOLUTION NEURAL NETWORKS

In this paper we propose a new technique, called etalons, for interpreting how a convolutional neural network (CNN) makes its predictions. The mechanism closely resembles voting among different experts. A CNN can thus be interpreted as a collection of experts, but it does not act as a sum or product of them; rather, it forms a complicated hierarchy. We implement an algorithm for etalon acquisition based on well-known properties of affine maps. We show that a neural network has two high-level voting mechanisms: the first is based on attention to input image regions specific to the current input, and the second is based on ignoring specific input regions. We also hypothesize that the complexity of the underlying data manifold is connected to the number of etalon images and their quality.
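
The abstract does not spell out the acquisition algorithm, so the following is only a minimal sketch of the affine-map property it most plausibly relies on: a network with ReLU activations is piecewise affine, so around any given input it coincides with a single affine map. The tiny fully connected network, the random weights, and the names `forward` and `local_affine_map` are all assumptions made for illustration; they are not the authors' method. Under this reading, each row of the recovered matrix `A` has the shape of the input and could be viewed as an image-like template for one output.

```python
import random


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def forward(x, W1, b1, W2, b2):
    """Hypothetical two-layer ReLU network: W2 @ relu(W1 @ x + b1) + b2."""
    pre = [p + b for p, b in zip(matvec(W1, x), b1)]
    hidden = [max(0.0, p) for p in pre]
    return [o + b for o, b in zip(matvec(W2, hidden), b2)]


def local_affine_map(x, W1, b1, W2, b2):
    """Return (A, c) with network(x') == A @ x' + c on the linear region of x.

    The ReLU activation pattern at x is frozen into a 0/1 mask, which
    turns the network into a composition of affine maps.
    """
    pre = [p + b for p, b in zip(matvec(W1, x), b1)]
    mask = [1.0 if p > 0 else 0.0 for p in pre]
    hidden_ids = range(len(mask))
    A = [[sum(W2[k][j] * mask[j] * W1[j][i] for j in hidden_ids)
          for i in range(len(x))]
         for k in range(len(W2))]
    c = [sum(W2[k][j] * mask[j] * b1[j] for j in hidden_ids) + b2[k]
         for k in range(len(W2))]
    return A, c


# Assumed toy sizes and random weights, purely for demonstration.
rng = random.Random(0)
n_in, n_hidden, n_out = 6, 8, 3
W1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
b1 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
W2 = [[rng.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_out)]
b2 = [rng.uniform(-1, 1) for _ in range(n_out)]
x = [rng.uniform(-1, 1) for _ in range(n_in)]

A, c = local_affine_map(x, W1, b1, W2, b2)
y_net = forward(x, W1, b1, W2, b2)
y_affine = [a + b for a, b in zip(matvec(A, x), c)]
```

Here `y_net` and `y_affine` agree up to floating-point rounding, which is the property the sketch is meant to show. A convolutional layer is itself a linear map, so the same masking argument would carry over to a CNN, although max pooling and other nonlinearities would need their own handling.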
