TropEx: An Algorithm for Extracting Linear Terms in Deep Neural Networks

Deep neural networks with rectified linear unit (ReLU) activations are piecewise linear functions, whose hyperplanes partition the input space into an astronomically large number of linear regions. Previous work has focused on counting linear regions to measure a network's expressive power and on analyzing geometric properties of the hyperplane configurations. In contrast, we aim to understand the impact of the linear terms on network performance by examining the information encoded in their coefficients. To this end, we derive TropEx, a non-trivial tropical-algebra-inspired algorithm that systematically extracts linear terms based on data. Applied to convolutional and fully-connected networks, our algorithm uncovers significant differences in how different networks utilize linear regions for generalization. This underlines the importance of systematic linear term exploration for better understanding generalization in neural networks trained on complex datasets.
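
To make "linear terms" concrete: within a single linear region, the ReLU activation pattern is constant, so the whole network collapses to one affine map whose coefficients can be read off by composing the weight matrices with inactive units masked out. The NumPy sketch below recovers this local affine map for a fully-connected ReLU network at one data point. It is a minimal illustration of the kind of object being extracted, not the TropEx algorithm itself (whose tropical-algebra construction is not reproduced here); the function name `local_linear_term` and the toy network layout are illustrative assumptions.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def local_linear_term(weights, biases, x):
    """Return (A, c) such that f(z) = A @ z + c on the linear region
    containing x, for a ReLU network with layers (W_1, b_1), ..., (W_L, b_L).

    Illustrative sketch, not the TropEx algorithm: within one linear
    region the ReLU mask is fixed, so composing the masked layers
    yields the local affine coefficients A and intercept c.
    """
    A = np.eye(x.size)    # running linear coefficients of the affine map
    c = np.zeros(x.size)  # running intercept
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        pre = W @ h + b                      # pre-activations at x
        mask = (pre > 0).astype(pre.dtype)   # active-ReLU pattern at x
        A = (W * mask[:, None]) @ A          # zero out rows of inactive units
        c = mask * (W @ c + b)
        h = relu(pre)
    W, b = weights[-1], biases[-1]           # final affine layer, no ReLU
    return W @ A, W @ c + b

# Usage on a small random network (hypothetical sizes).
rng = np.random.default_rng(0)
dims = [4, 8, 8, 3]
weights = [rng.standard_normal((dims[i + 1], dims[i])) for i in range(3)]
biases = [rng.standard_normal(dims[i + 1]) for i in range(3)]
x = rng.standard_normal(4)

A, c = local_linear_term(weights, biases, x)

# Sanity check: the extracted affine map must reproduce the network output at x.
h = x
for W, b in zip(weights[:-1], biases[:-1]):
    h = relu(W @ h + b)
out = weights[-1] @ h + biases[-1]
assert np.allclose(A @ x + c, out)
```

The closing assertion verifies that the extracted affine map agrees with the network output at x, which holds exactly throughout the linear region containing x.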
