A Dynamic Neural Network Architecture by Sequential Partitioning of the Input Space

We present a sequential approach to training multilayer perceptrons for pattern classification. The network sees each item of data only once, and its architecture is adjusted dynamically during training. When each example arrives, three heuristic criteria determine whether to increase the complexity of the network or simply to train the existing nodes. These criteria measure the position of the new item in the input space relative to the information currently stored in the network. During training, each layer is treated as an independent entity with its own input space. Adding a node to a layer effectively adds a hyperplane, and hence a new partition, to that layer's input space. When the existing nodes are sufficient to accommodate the incoming input, the corresponding hidden nodes are trained instead. Each hidden unit is trained in closed form by a recursive least-squares (RLS) algorithm: a local covariance matrix of the data is maintained at each node, and the closed-form solution is updated recursively. The three criteria are computed from these covariance matrices to keep the computational cost low. The performance of the algorithm is illustrated on two problems. The first is the two-dimensional Peterson and Barney vowel data; the second is a 33-dimensional dataset derived from a vision system for classifying wheat grains. The sequential nature of the algorithm lends itself to efficient hardware implementation in the form of systolic arrays, and the incremental training scheme has better biological plausibility than iterative methods.
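The per-node closed-form training described above can be illustrated with a standard RLS update, in which each node keeps an inverse-covariance matrix `P` and refines its weights recursively as each example arrives. This is a minimal sketch of generic RLS, not the paper's exact formulation; the names (`RLSNode`, `lam`, `delta`) and the plain-list linear algebra are illustrative assumptions.

```python
class RLSNode:
    """One hidden unit trained in closed form via recursive least squares.

    Maintains P, the inverse of the (regularised) input covariance matrix,
    so each new example updates the weights without iterative retraining.
    Names and parameters are illustrative, not taken from the paper.
    """

    def __init__(self, dim, lam=1.0, delta=100.0):
        self.lam = lam                       # forgetting factor (1.0 = none)
        self.w = [0.0] * dim                 # weight vector, starts at zero
        # P starts as delta * I (large delta => weak prior on the weights)
        self.P = [[delta if i == j else 0.0 for j in range(dim)]
                  for i in range(dim)]

    def update(self, x, d):
        n = len(x)
        # Px = P @ x
        Px = [sum(Pi[j] * x[j] for j in range(n)) for Pi in self.P]
        # gain vector k = P x / (lam + x' P x)
        denom = self.lam + sum(x[i] * Px[i] for i in range(n))
        k = [v / denom for v in Px]
        # a-priori error, then weight update: w += k * (d - w' x)
        err = d - sum(self.w[i] * x[i] for i in range(n))
        self.w = [self.w[i] + k[i] * err for i in range(n)]
        # P = (P - k (x' P)) / lam  (x' P equals Px since P is symmetric)
        self.P = [[(self.P[i][j] - k[i] * Px[j]) / self.lam
                   for j in range(n)] for i in range(n)]

    def predict(self, x):
        return sum(self.w[i] * x[i] for i in range(len(x)))
```

For instance, feeding the node examples of d = 2*x + 1 (with a constant bias input) once each, in sequence, recovers the line in closed form without any iterative passes over the data.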
