Logistic Regression by Means of Evolutionary Radial Basis Function Neural Networks

This paper proposes a hybrid multilogistic methodology, named logistic regression using initial and radial basis function (RBF) covariates. The coefficients are obtained in three steps. First, an evolutionary programming (EP) algorithm is applied to produce an RBF neural network (RBFNN) with a reduced number of RBF transformations and the simplest possible structure. Then, the initial attribute space (or, as it is commonly known in the logistic regression literature, the covariate space) is transformed by adding the nonlinear transformations of the input variables given by the RBFs of the best individual in the final generation. Finally, a maximum likelihood optimization method determines the coefficients associated with a multilogistic regression model built in this augmented covariate space. In this final step, two different multilogistic regression algorithms are applied: one considers all initial and RBF covariates (multilogistic initial-RBF regression), and the other incrementally constructs the model and applies cross validation, resulting in an automatic covariate selection [simplelogistic initial-RBF regression (SLIRBF)]. Both methods include a regularization parameter, which has also been optimized. The proposed methodology is tested using 18 well-known machine learning benchmark classification problems and two real agronomical problems. The results are compared with the corresponding multilogistic regression methods applied to the initial covariate space, to the RBFNNs obtained by the EP algorithm (RBFEP), and to other probabilistic classifiers, including different RBFNN design methods [e.g., relaxed variable kernel density estimation, support vector machines, a sparse classifier (sparse multinomial logistic regression)] and a procedure similar to SLIRBF but using product unit basis functions. The SLIRBF models are found to be competitive when compared with the corresponding multilogistic regression methods and the RBFEP method. A measure of statistical significance is used, which indicates that SLIRBF reaches the state of the art.
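
The three steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several parts are assumptions made only for illustration: the evolutionary programming step is replaced by a random choice of training points as RBF centers with a fixed radius, the RBFs are assumed to be Gaussian of the form exp(-||x - c||^2 / r^2), and the regularized maximum likelihood fit is done by plain gradient descent on a ridge-penalized softmax model. All function names (`rbf_features`, `augment`, `fit_multilogistic`, `predict_proba`) are hypothetical.

```python
import numpy as np


def rbf_features(X, centers, radii):
    """Gaussian RBF outputs exp(-||x - c_j||^2 / r_j^2) (assumed form)."""
    sq_dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / radii ** 2)


def augment(X, centers, radii):
    """Step 2: append the RBF outputs to the initial covariates."""
    return np.hstack([X, rbf_features(X, centers, radii)])


def softmax(scores):
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    exp_scores = np.exp(scores)
    return exp_scores / exp_scores.sum(axis=1, keepdims=True)


def fit_multilogistic(Z, y, n_classes, ridge=1e-2, lr=0.1, n_iter=2000):
    """Step 3 (stand-in): ridge-penalized maximum likelihood by gradient descent."""
    n = Z.shape[0]
    Zb = np.hstack([np.ones((n, 1)), Z])      # intercept column
    Y = np.eye(n_classes)[y]                  # one-hot class labels
    W = np.zeros((Zb.shape[1], n_classes))
    for _ in range(n_iter):
        P = softmax(Zb @ W)
        grad = Zb.T @ (P - Y) / n             # gradient of mean negative log-likelihood
        grad[1:] += ridge * W[1:]             # ridge term; intercept is not penalized
        W -= lr * grad
    return W


def predict_proba(Z, W):
    Zb = np.hstack([np.ones((Z.shape[0], 1)), Z])
    return softmax(Zb @ W)


# Toy two-class problem: a central blob surrounded by a ring.
rng = np.random.default_rng(0)
inner = rng.normal(0.0, 0.3, size=(100, 2))
angles = rng.uniform(0.0, 2 * np.pi, size=100)
ring = 2.0 * np.column_stack([np.cos(angles), np.sin(angles)])
ring += rng.normal(0.0, 0.1, size=(100, 2))
X = np.vstack([inner, ring])
y = np.repeat([0, 1], 100)

# Step 1 (stand-in, NOT the EP algorithm): random training points as centers.
centers = X[rng.choice(len(X), size=4, replace=False)]
radii = np.full(4, 1.0)

Z = augment(X, centers, radii)                # 2 initial + 4 RBF covariates
W = fit_multilogistic(Z, y, n_classes=2)
proba = predict_proba(Z, W)
accuracy = (proba.argmax(axis=1) == y).mean()
print(f"training accuracy with initial + RBF covariates: {accuracy:.3f}")
```

In this sketch the linear model sees both the original inputs and the RBF outputs, which is the idea behind the augmented covariate space. The covariate selection performed by SLIRBF and the tuning of the regularization parameter are not reproduced here.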
