RBF nets, mixture experts, and Bayesian Ying-Yang learning

Abstract: Connections are established between the alternative model for mixture of experts (ME) and two types of RBF nets, the normalized radial basis function (NRBF) nets and the extended normalized RBF (ENRBF) nets, and the well-known expectation-maximization (EM) algorithm for maximum likelihood learning is proposed for training both types. This new learning technique determines the parameters of the input layer (including the covariance matrices, or so-called receptive fields) and the parameters of the output layer jointly, instead of training the input layer by the K-means algorithm and the output layer by least-squares learning separately, as most existing RBF learning methods do. In addition, coordinated competitive learning (CCL) and adaptive algorithms are proposed to approximate the EM algorithm, considerably speeding up the learning of the original and alternative ME models as well as the NRBF and ENRBF nets. Furthermore, the two ME models are linked to the recently proposed Bayesian Ying-Yang (BYY) learning system and theory, such that not only is the architecture of ME and RBF nets shown to be preferable to a multilayer architecture, but a new model selection criterion is also obtained for determining the number of experts and basis functions. Experiments are reported on the prediction of foreign exchange rates and trading investment, as well as on piecewise nonlinear regression and piecewise line fitting. In these experiments, the EM algorithm for NRBF and ENRBF nets clearly outperforms the conventional RBF learning technique, CCL speeds up learning considerably with only a slight loss of accuracy, the adaptive algorithm gives significant improvements in financial prediction and trading investment, and the proposed criterion successfully selects the number of basis functions. In addition, the ENRBF net and the alternative ME model are shown to be able to implement curve fitting and detection.
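
For readers unfamiliar with the two network types named in the abstract, the following is a minimal sketch of their forward passes, not the authors' implementation. It assumes the standard formulation in which an NRBF net outputs an activation-weighted average of constant weights and an ENRBF net replaces each constant weight with a local linear model. For brevity it also assumes isotropic Gaussian basis functions with a scalar width, whereas the abstract refers to full covariance matrices. All function and parameter names (`gaussian_activations`, `nrbf_output`, `enrbf_output`, `centers`, `widths`, `slopes`, `biases`) are hypothetical, and no learning step (EM, CCL, or adaptive) is shown.

```python
import math


def gaussian_activations(x, centers, widths):
    """Unnormalized isotropic Gaussian activation of each basis function at x.

    Assumption: one scalar width per basis function instead of a full
    covariance matrix.
    """
    acts = []
    for c, s in zip(centers, widths):
        sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
        acts.append(math.exp(-0.5 * sq_dist / (s * s)))
    return acts


def nrbf_output(x, centers, widths, weights):
    """NRBF net: average of constant output weights, weighted by activation."""
    acts = gaussian_activations(x, centers, widths)
    total = sum(acts)  # can underflow to 0.0 if x is far from every center
    return sum(a * w for a, w in zip(acts, weights)) / total


def enrbf_output(x, centers, widths, slopes, biases):
    """ENRBF net: each constant weight is replaced by a local linear model."""
    acts = gaussian_activations(x, centers, widths)
    total = sum(acts)
    # One local linear prediction per basis function: slope . x + bias
    local = [
        sum(si * xi for si, xi in zip(s, x)) + b
        for s, b in zip(slopes, biases)
    ]
    return sum(a * l for a, l in zip(acts, local)) / total


if __name__ == "__main__":
    centers = [[0.0], [2.0]]
    widths = [1.0, 1.0]
    # Midway between the two centers, both basis functions are equally active,
    # so the NRBF output is the plain average of the two weights.
    print(nrbf_output([1.0], centers, widths, weights=[1.0, 3.0]))
    print(enrbf_output([1.0], centers, widths, slopes=[[1.0], [1.0]], biases=[0.0, 0.0]))
```

Because the normalized activations sum to one, they play the same role as the gating probabilities in a mixture of experts, which is the intuition behind the connection the abstract describes.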
