Quadratic function nodes: Use, structure and training

Abstract: When computing on continuously valued features, quadratic function nodes provide a capability for prototypic categorization analogous to what linear nodes provide for binary features. This paper extends the formal analysis and empirical measurement of thresholded linear functions of multivalued features to determine the corresponding properties for quadratic functions. As with linear functions, the number of functions, the weight size, the training speed, and the number of nodes necessary to represent arbitrary Boolean functions are shown to increase polynomially with the number of distinct, equally spaced values the input features can assume (that is, with the required resolution), and exponentially with the number of features. Also paralleling linear functions, certain interesting subclasses of quadratic functions are shown to be learnable in polynomial time. These include functions that require exponential resources (nodes and training time) if a disjunction or conjunction of linear functions (a convex polygon) is used. Alternatively, if a distributed representation is possible, these functions can be represented with a linear number of linear nodes.
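To make the contrast in the abstract concrete, the following is a minimal illustrative sketch (not code from the paper) of the two node types it compares: a linear threshold node, which separates inputs with a hyperplane, and a quadratic node, which can carve out a bounded region around a prototype; the function names and the spherical form of the quadratic surface are assumptions for illustration.

```python
import numpy as np

def linear_node(x, w, b):
    """Linear threshold unit: fires on one side of the hyperplane w.x + b = 0."""
    return 1 if np.dot(w, x) + b >= 0 else 0

def quadratic_node(x, prototype, threshold):
    """Quadratic threshold unit (spherical special case): fires when x lies
    within squared distance `threshold` of a prototype point. The decision
    surface ||x - p||^2 = threshold is quadratic in the inputs, so one node
    bounds a region that no single linear node can enclose."""
    return 1 if np.sum((x - prototype) ** 2) <= threshold else 0

# Example: accept points near the prototype (0, 0), reject distant ones.
p = np.array([0.0, 0.0])
print(quadratic_node(np.array([0.3, 0.4]), p, 1.0))  # inside the region -> 1
print(quadratic_node(np.array([3.0, 0.0]), p, 1.0))  # outside the region -> 0
```

Approximating such a prototype region with linear nodes requires a conjunction of half-planes (a convex polygon), which is the representation the abstract notes can need exponentially more resources.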