The Generalization Complexity Measure for Continuous Input Data

In this work we introduce an extension of the generalization complexity measure to continuous input data. The measure, originally defined in Boolean space, quantifies the complexity of a data set in relation to the prediction accuracy that can be expected when a supervised classifier, such as a neural network or an SVM, is trained on it. We first extend the original measure to continuous functions and then, using an approach based on the set of Walsh functions, treat the case of a finite number of data points (input/output pairs), which is the usual situation in practice. Using a set of trigonometric functions, we construct a model that relates the complexity of the data to the size of the hidden layer of a neural network. Finally, we demonstrate the application of the introduced complexity measure, through the generated model, to the problem of estimating an adequate neural network architecture for real-world data sets.
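As a minimal sketch of the Walsh-function machinery the abstract refers to (the paper's actual complexity computation is not reproduced here, and the function names below are illustrative): the Walsh functions on a dyadic grid are the rows of a Sylvester-construction Hadamard matrix, and projecting sampled data onto them yields an orthogonal expansion analogous to a Fourier series.

```python
import numpy as np

def hadamard(n):
    """Sylvester-construction Hadamard matrix of order 2**n.
    Its rows are the Walsh functions (in natural/Hadamard order)
    sampled on 2**n equal subintervals of [0, 1)."""
    H = np.array([[1]])
    for _ in range(n):
        H = np.block([[H, H], [H, -H]])
    return H

def walsh_coefficients(samples):
    """Project a signal sampled at 2**n points onto the Walsh basis.
    The rows of H are orthogonal with squared norm 2**n, so the
    expansion coefficients are H @ samples / len(samples)."""
    N = len(samples)
    n = int(np.log2(N))
    assert 2 ** n == N, "number of samples must be a power of two"
    return hadamard(n) @ samples / N

# Example: a symmetric step function is itself a Walsh function,
# so its expansion has a single nonzero coefficient.
x = np.linspace(0, 1, 8, endpoint=False)
step = np.where(x < 0.5, 1.0, -1.0)
coeffs = walsh_coefficients(step)
```

A smooth function such as a sine wave would instead spread its energy over many Walsh coefficients, which is the kind of spectral spread a complexity measure of this sort can exploit.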
