Enhancing Recursive Supervised Learning Using Clustering and Combinatorial Optimization (RSL-CC)

Using a team of weak learners to learn a dataset has been shown to outperform using a single strong learner. In fact, the idea is so successful that boosting, an algorithm that combines several weak learners for supervised learning, is considered one of the best off-the-shelf classifiers. However, open problems remain, including determining the optimal number of weak learners and avoiding overfitting. In earlier work, we developed the RPHP algorithm, which addresses both problems using a combination of a genetic algorithm, weak learners, and a pattern distributor. In this paper, we revise the global search component by replacing it with cluster-based combinatorial optimization. Patterns are clustered according to the output space of the problem, i.e., natural clusters are formed from the patterns belonging to each class. This yields a combinatorial optimization problem, which is solved using evolutionary algorithms. The evolutionary algorithms identify the “easy” and the “difficult” clusters in the system. Removing the easy patterns then allows focused learning of the more complicated patterns, so the problem becomes recursively simpler. Overfitting is overcome by using a set of validation patterns together with a pattern distributor. An algorithm is also proposed that uses the pattern distributor to determine the optimal number of recursions, and hence the optimal number of weak learners, for the problem. Empirical studies show generally good performance compared with other state-of-the-art methods.

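The following is a minimal sketch of the recursion the abstract describes, under stated assumptions: the abstract does not specify the weak learner, the GA encoding, or the fitness function, so the nearest-centroid stand-in learner, the coverage-weighted training-accuracy fitness, and all names (ga_select_easy_clusters, rsl_cc, and so on) are illustrative assumptions, not the authors' implementation.

# Sketch of the RSL-CC loop: cluster patterns by class, let a small GA pick
# the "easy" cluster subset, train a weak learner on it, remove those
# patterns, and recurse on the harder remainder. Hypothetical details are
# flagged in comments.

import numpy as np

rng = np.random.default_rng(0)

def cluster_by_class(X, y):
    """'Natural' clusters: the indices of the patterns in each class."""
    return [np.where(y == c)[0] for c in np.unique(y)]

class NearestCentroid:
    """Stand-in weak learner (assumption; the paper's learners are neural networks)."""
    def fit(self, X, y):
        self.classes = np.unique(y)
        self.centroids = np.array([X[y == c].mean(axis=0) for c in self.classes])
        return self
    def predict(self, X):
        d = ((X[:, None, :] - self.centroids[None, :, :]) ** 2).sum(axis=2)
        return self.classes[d.argmin(axis=1)]

def fitness(mask, clusters, X, y):
    """Score a cluster subset: training accuracy of a weak learner on the
    selected patterns, weighted by coverage so 'easy AND large' wins.
    (Assumed objective; the abstract does not give the actual fitness.)"""
    if not mask.any():
        return 0.0
    idx = np.concatenate([c for bit, c in zip(mask, clusters) if bit])
    learner = NearestCentroid().fit(X[idx], y[idx])
    acc = (learner.predict(X[idx]) == y[idx]).mean()
    return acc * idx.size / len(y)

def ga_select_easy_clusters(clusters, X, y, pop=20, gens=30):
    """Tiny GA over binary cluster-selection masks (the combinatorial step)."""
    k = len(clusters)
    popn = rng.integers(0, 2, size=(pop, k))
    for _ in range(gens):
        scores = np.array([fitness(m, clusters, X, y) for m in popn])
        parents = popn[scores.argsort()[-pop // 2:]]               # truncation selection
        cut = rng.integers(1, k) if k > 1 else 1
        kids = np.hstack([parents[:, :cut], parents[::-1, cut:]])  # one-point crossover
        flip = rng.random(kids.shape) < 1.0 / k                    # bit-flip mutation
        popn = np.vstack([parents, np.where(flip, 1 - kids, kids)])
    return max(popn, key=lambda m: fitness(m, clusters, X, y)).astype(bool)

def rsl_cc(X, y, max_recursions=5):
    """Recursion: learn the easy clusters, remove their patterns, repeat on
    the rest. A full implementation would pick the recursion depth that
    minimizes error on held-out validation patterns routed by the pattern
    distributor, as the abstract describes."""
    learners, remaining = [], np.arange(len(y))
    for _ in range(max_recursions):
        Xr, yr = X[remaining], y[remaining]
        clusters = cluster_by_class(Xr, yr)
        mask = ga_select_easy_clusters(clusters, Xr, yr)
        if not mask.any():
            break
        easy = np.concatenate([c for bit, c in zip(mask, clusters) if bit])
        learners.append(NearestCentroid().fit(Xr[easy], yr[easy]))
        remaining = np.delete(remaining, easy)  # recurse on the hard remainder
        if remaining.size == 0:
            break
    return learners

Weighting accuracy by coverage is one plausible way to make the GA prefer cluster subsets that are both large and easily learned; the original RSL-CC may well use a different objective and a different stopping rule.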