The Vapnik-Chervonenkis dimension of graph and recursive neural networks

The Vapnik-Chervonenkis dimension (VC-dim) characterizes the sample complexity of learning a classification model and is often used as an indicator of the generalization capability of a learning method. The VC-dim has been studied for common feed-forward neural networks, but it has not yet been studied for Graph Neural Networks (GNNs) and Recursive Neural Networks (RecNNs). This paper provides upper bounds on the order of growth of the VC-dim of GNNs and RecNNs. GNNs and RecNNs belong to a class of neural network models capable of processing inputs that are given as graphs. A graph is a data structure that generalizes the representational power of vectors and sequences through its ability to encode dependencies or relationships among feature vectors. It was shown previously that the ability of recurrent neural networks to process sequences increases the VC-dim compared to that of feed-forward neural networks, which are limited to processing vectors. Since graphs are a more general structure than sequences, the question arises of how this affects the VC-dim of GNNs and RecNNs. A main finding of this paper is that the upper bounds on the VC-dim of GNNs and RecNNs are comparable to the upper bounds for recurrent neural networks. The results also suggest that the generalization capability of such models increases with the number of connected nodes.
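
To make the quantity under study concrete, the following recalls the classical definition of the VC-dim and one standard generalization bound from statistical learning theory. The notation (H, d, m, delta) is introduced here only for illustration, and the bound shown is the well-known general form, not the specific bounds derived in this paper.

A hypothesis class \(H\) of binary classifiers shatters a set of points \(\{x_1,\dots,x_m\}\) if every one of the \(2^m\) possible labellings can be realized by some \(h \in H\); the VC-dim is the size of the largest shatterable set:
\[
\mathrm{VC\text{-}dim}(H) = \max\bigl\{\, m : \exists\, x_1,\dots,x_m \text{ with } \bigl|\{(h(x_1),\dots,h(x_m)) : h \in H\}\bigr| = 2^m \,\bigr\}.
\]
With probability at least \(1-\delta\) over \(m\) i.i.d. training samples, every \(h \in H\) then satisfies, for \(d = \mathrm{VC\text{-}dim}(H)\),
\[
\mathrm{err}(h) \;\le\; \widehat{\mathrm{err}}(h) + O\!\left(\sqrt{\frac{d\,\ln(m/d) + \ln(1/\delta)}{m}}\right),
\]
so a smaller VC-dim relative to the sample size yields a tighter generalization guarantee.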
