Ensembles of diverse neural networks

Existing ensemble methods generally train a predefined number of networks and include all of them in the final ensemble. The performance of such an ensemble may therefore be poor when any individual network performs very badly, whereas an ensemble built only from properly selected networks can be expected to perform better. In this context, an ensemble method is proposed in this thesis that first creates a pool of diverse networks and then applies a selection scheme to construct an ensemble from the appropriate ones. Three data sampling techniques are considered for creating the network pool, and two selection schemes are investigated. Experimental results on a large number of benchmark problems show that the proposed method outperforms traditional methods while producing more concise ensembles.

This thesis also investigates an ensemble method that automatically determines a minimal ensemble architecture for a given problem. The method starts with a single network having a minimal number of hidden units. During training it adds further networks, each with a cumulatively larger number of hidden units, and each added network specializes in the portion of the input space left unsolved so far. Finally, all the networks are trained simultaneously to improve generalization ability. If a single network already achieves an acceptable result for a problem, no ensemble is built and the single network is returned. On benchmark problems, the proposed method is found competitive with existing ensemble methods while using a minimal architecture.

Further, this study investigates an indirect communication scheme among the networks of an ensemble during training and proposes a progressive interactive training scheme. In the proposed scheme, networks are trained one after another, and interaction among them is maintained indirectly via an intermediate space called the information center. The idea of indirect communication is conceived from the way biological ants communicate via pheromone: an individual ant decides its travelling path based on the pheromone already deposited on the trail, and it also deposits pheromone along its own path. Indirect interaction has several benefits over the direct interaction scheme of negative correlation learning. Experimental results show that ensembles constructed with the proposed training scheme perform well. The idea of indirect interaction is also incorporated into bagging and AdaBoost, and is shown to improve their performance.
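The first contribution, pool creation followed by selection, can be illustrated with a minimal sketch. The code below builds a pool of networks on bootstrap samples, which is only one of several possible data sampling techniques, and then greedily keeps a network only if it improves majority-vote accuracy on a validation set. The helper names (build_pool, greedy_select), the use of scikit-learn MLPs, and the greedy selection rule are assumptions for illustration rather than the thesis's actual implementation.

```python
# Minimal sketch (assumed names and details): create a pool of diverse
# networks with bootstrap sampling, then greedily select a concise subset.
# Assumes X_*, y_* are NumPy arrays and class labels are integers 0..K-1.
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.neural_network import MLPClassifier
from sklearn.utils import resample


def build_pool(X_train, y_train, pool_size=10, hidden_units=5, seed=0):
    """Train pool_size networks, each on a different bootstrap sample."""
    rng = np.random.RandomState(seed)
    pool = []
    for i in range(pool_size):
        Xb, yb = resample(X_train, y_train, random_state=rng.randint(1 << 30))
        net = MLPClassifier(hidden_layer_sizes=(hidden_units,),
                            max_iter=500, random_state=i)
        net.fit(Xb, yb)
        pool.append(net)
    return pool


def greedy_select(pool, X_val, y_val):
    """Keep a network only if it improves majority-vote validation accuracy."""
    def vote_acc(nets):
        votes = np.stack([n.predict(X_val) for n in nets]).astype(int)
        majority = np.apply_along_axis(lambda col: np.bincount(col).argmax(),
                                       0, votes)
        return accuracy_score(y_val, majority)

    # Consider strong individual networks first.
    ranked = sorted(pool, key=lambda n: accuracy_score(y_val, n.predict(X_val)),
                    reverse=True)
    selected, best_acc = [], -1.0
    for net in ranked:
        acc = vote_acc(selected + [net])
        if acc > best_acc:
            selected.append(net)
            best_acc = acc
    return selected
```

Keeping only the networks that actually improve the combined vote, rather than the whole pool, is what makes the resulting ensemble concise.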
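The progressive interactive training scheme can likewise be sketched in a simplified form. In the sketch below, the information center is taken to be a running average of the ensemble's class-probability outputs; each new network "reads the pheromone" by resampling its training data toward examples the center still handles poorly, and then "deposits pheromone" by folding its own outputs into the center. The emphasis rule and the running-average update are assumptions made for illustration; the thesis's actual interaction rules may differ.

```python
# Minimal sketch (assumed update rules): progressive interactive training in
# which networks interact only through an "information center" that stores a
# running average of the ensemble's class-probability outputs.
# Assumes X, y are NumPy arrays, labels are integers 0..K-1, and every
# resample contains all classes (so predict_proba shapes match).
import numpy as np
from sklearn.neural_network import MLPClassifier


def progressive_interactive_training(X, y, n_networks=5, hidden_units=5, seed=0):
    rng = np.random.RandomState(seed)
    n_classes = len(np.unique(y))
    center = np.full((len(X), n_classes), 1.0 / n_classes)  # information center
    ensemble = []
    for k in range(n_networks):
        # "Read the pheromone": emphasize examples the center predicts poorly.
        prob_of_true_class = center[np.arange(len(X)), y]
        weights = (1.0 - prob_of_true_class) + 1e-3
        weights /= weights.sum()
        idx = rng.choice(len(X), size=len(X), replace=True, p=weights)

        net = MLPClassifier(hidden_layer_sizes=(hidden_units,),
                            max_iter=500, random_state=k)
        net.fit(X[idx], y[idx])
        ensemble.append(net)

        # "Deposit pheromone": fold this network's outputs into the center.
        center = (k * center + net.predict_proba(X)) / (k + 1)
    return ensemble
```

Because each network sees only the accumulated information center rather than the other networks directly, the interaction remains indirect, in contrast to the simultaneous coupling of networks in negative correlation learning.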
