Extended Mixture of MLP Experts by Hybrid of Conjugate Gradient Method and Modified Cuckoo Search

This paper investigates a new method for improving the learning algorithm of the Mixture of Experts (ME) model using a hybrid of Modified Cuckoo Search (MCS) and Conjugate Gradient (CG), a second-order optimization technique. The CG technique is combined with the Back-Propagation (BP) algorithm to yield a much more efficient learning algorithm for the ME structure. In addition, the experts and gating networks in the enhanced model are replaced by CG-based Multi-Layer Perceptrons (MLPs) to provide faster and more accurate learning. Because CG depends considerably on the initial connection weights of the Artificial Neural Network (ANN), a metaheuristic algorithm, Modified Cuckoo Search, is applied to select the optimal initial weights. The performance of the proposed method is compared with Gradient Descent Based ME (GDME) and Conjugate Gradient Based ME (CGME) on classification and regression problems. The experimental results show that the hybrid MCS and CG based ME (MCS-CGME) converges faster and performs better on the benchmark data sets used.
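As a rough illustration of the ME structure described above, the following minimal sketch shows how a gating MLP can weight the outputs of several MLP experts. It assumes a softmax gate, one tanh hidden layer per network, and random initial weights. All function names are hypothetical, and neither the CG training step nor the MCS weight search is implemented here.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def init_mlp(n_in, n_hidden, n_out, rng):
    """Random small weights; in the paper MCS would choose these instead (assumption)."""
    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]
    return (mat(n_hidden, n_in), [0.0] * n_hidden, mat(n_out, n_hidden), [0.0] * n_out)


def mlp_forward(x, params):
    """One-hidden-layer MLP: tanh hidden units, linear output units."""
    w1, b1, w2, b2 = params
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(w2, b2)]


def me_forward(x, experts, gate):
    """Combine expert outputs with softmax gating weights."""
    g = softmax(mlp_forward(x, gate))          # one weight per expert
    outs = [mlp_forward(x, e) for e in experts]
    n_out = len(outs[0])
    y = [sum(g[i] * outs[i][k] for i in range(len(experts))) for k in range(n_out)]
    return y, g


rng = random.Random(0)
experts = [init_mlp(2, 4, 1, rng) for _ in range(3)]
gate = init_mlp(2, 4, len(experts), rng)       # gate emits one score per expert
y, g = me_forward([0.2, -0.1], experts, gate)
```

Because the gating weights are non-negative and sum to one, the ME output is a convex combination of the expert outputs. A training procedure such as the one the abstract describes would then adjust both the expert and gate weights.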
