Self-Organizing Networks for Nonparametric Regression

Widely known statistical and artificial neural network methods for regression are based on function approximation, i.e., representing an unknown (high-dimensional) function as a decomposition/superposition of simpler (low-dimensional) basis functions. Such methods naturally follow the supervised learning paradigm. In this paper we discuss a different approach to nonparametric regression that is based on unsupervised learning. Unsupervised learning methods are commonly used for modeling an (unknown) density distribution and for vector quantization. However, they can also be used for adaptive positioning of units (“knots”) along the regression surface, thereby providing a discrete approximation of the unknown function. A method for adaptive positioning of knots, called Constrained Topological Mapping (CTM), is discussed in detail. CTM is a modification of the biologically inspired method known as Kohonen’s Self-Organizing Maps (SOM) that makes it suitable for regression. SOM and CTM methods effectively combine iterative (flow-through) computation and local regularization to achieve robust performance and modeling flexibility. This paper describes SOM/CTM methods in the general framework of adaptive methods for regression. We also suggest several statistically motivated improvements of these methods.
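The idea of positioning knots along a regression surface with a SOM-style update can be illustrated with a minimal sketch for a single predictor. It is not the authors' implementation. It assumes one common description of CTM, in which the winning unit is selected by distance in the predictor (x) coordinate only while the update is applied in the joint (x, y) space. The function names `fit_ctm_1d` and `predict`, the decay schedules, and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def fit_ctm_1d(x, y, n_units=10, n_epochs=50, seed=0):
    """Place n_units knots along a 1-D regression curve (illustrative sketch)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    data = np.column_stack([x, y])

    # Assumed initialisation: units spread evenly in x, at the mean response.
    units = np.column_stack([
        np.linspace(x.min(), x.max(), n_units),
        np.full(n_units, y.mean()),
    ])
    idx = np.arange(n_units)          # positions of units on the 1-D map
    total = n_epochs * len(data)
    step = 0
    for _ in range(n_epochs):
        for i in rng.permutation(len(data)):
            frac = step / total
            # Assumed schedules: learning rate decays from 0.5 to 0.01,
            # neighbourhood width shrinks from n_units / 2 to 0.5.
            lr = 0.5 * (0.01 / 0.5) ** frac
            width = (n_units / 2) * (0.5 / (n_units / 2)) ** frac
            # Winner is chosen in the predictor coordinate only.
            winner = np.argmin(np.abs(units[:, 0] - data[i, 0]))
            # Gaussian neighbourhood on the map acts as a local smoother.
            h = np.exp(-0.5 * ((idx - winner) / width) ** 2)
            # Update is applied in the joint (x, y) space.
            units += lr * h[:, None] * (data[i] - units)
            step += 1
    return units


def predict(units, x_new):
    """Piecewise-linear interpolation between the trained knots."""
    order = np.argsort(units[:, 0])
    return np.interp(x_new, units[order, 0], units[order, 1])
```

A call such as `units = fit_ctm_1d(x, y)` followed by `predict(units, x_new)` gives a discrete approximation of the regression curve. The shrinking neighbourhood width plays the role of the local regularization mentioned above: wide neighbourhoods early in training produce a smooth fit, and narrow ones later allow the knots to follow local structure.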
