Characterizing Learning Dynamics of Deep Neural Networks via Complex Networks

In this paper, we interpret Deep Neural Networks (DNNs) with Complex Network Theory (CNT), which represents DNNs as directed weighted graphs and studies them as dynamical systems. We efficiently adapt CNT measures to examine how the learning process of DNNs evolves across different initializations and architectures, introducing metrics for nodes/neurons and layers: Nodes Strength and Layers Fluctuation. Our framework distills trends in the learning dynamics and separates low-accuracy from high-accuracy networks. We characterize both populations of neural networks (ensemble analysis) and single instances (individual analysis). On standard image-recognition problems, we show that specific learning dynamics are indistinguishable when analysed through link weights alone. In contrast, Nodes Strength and Layers Fluctuation reveal previously unobserved behaviours: accurate networks, compared to under-trained models, exhibit substantially divergent distributions with more extreme deviations. On top of this study, we provide an efficient implementation of the CNT metrics for both Convolutional and Fully Connected networks, to accelerate research in this direction.
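As a minimal sketch of the kind of measure the abstract refers to, node strength in Complex Network Theory is conventionally defined as the sum of the weights of a node's incident edges. Assuming the paper's Nodes Strength follows this standard definition (the exact formulation used by the authors may differ), it can be computed directly from a fully connected layer's weight matrix:

```python
import numpy as np

def node_strengths(W):
    """Standard CNT node strength for a layer weight matrix W of shape
    (out_features, in_features): each output neuron's in-strength is the
    sum of its incoming weights (row sum), and each input neuron's
    out-strength is the sum of its outgoing weights (column sum)."""
    W = np.asarray(W, dtype=float)
    in_strength = W.sum(axis=1)   # one value per output neuron
    out_strength = W.sum(axis=0)  # one value per input neuron
    return in_strength, out_strength

# Toy example: a layer with 3 inputs and 2 outputs.
W = np.array([[0.5, -0.2, 0.1],
              [0.3,  0.4, -0.6]])
s_in, s_out = node_strengths(W)
```

Tracking how the distribution of these strengths evolves over training epochs is one way to realize the ensemble and individual analyses described above.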
