Paper Information - A Spatial EA Framework for Parallelizing Machine Learning Methods

A Spatial EA Framework for Parallelizing Machine Learning Methods

The scalability of machine learning (ML) algorithms has become increasingly important because of the ever-increasing size of datasets and the growing complexity of the induced models. Standard approaches to this issue generally involve developing parallel and distributed versions of the ML algorithms, reducing dataset sizes via sampling techniques, or both. In this paper we describe an alternative approach that combines features of spatially-structured evolutionary algorithms (SSEAs) with the well-known machine learning techniques of ensemble learning and boosting. The result is a powerful and robust framework for parallelizing ML methods that requires no changes to the ML methods themselves. We first describe the framework and illustrate its behavior on a simple synthetic problem, and then evaluate its scalability and robustness using several different ML methods on a set of benchmark problems from the UC Irvine ML database.
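The abstract does not spell out the mechanics of the framework, so the following is only a minimal sketch of how the three named ingredients could fit together, not the authors' implementation. It assumes a toroidal grid with von Neumann neighborhoods (a common SSEA layout), a boosting-like resampling step that favors low-margin instances, and a majority-vote ensemble. The names `fit_centroid`, `run_epoch`, `run`, and `ensemble_predict` are hypothetical, and the nearest-centroid learner is a stand-in for any unmodified ML method.

```python
import random
from collections import Counter


def von_neumann_neighbors(r, c, rows, cols):
    """Return the cell (r, c) plus its four von Neumann neighbors on a torus."""
    return [(r, c),
            ((r - 1) % rows, c), ((r + 1) % rows, c),
            (r, (c - 1) % cols), (r, (c + 1) % cols)]


def fit_centroid(data):
    """Hypothetical stand-in learner, treated as a black box.

    data is a list of (x, label) pairs; returns predict(x) -> (label, margin).
    """
    sums, counts = Counter(), Counter()
    for x, y in data:
        sums[y] += x
        counts[y] += 1
    centroids = {y: sums[y] / counts[y] for y in counts}

    def predict(x):
        dists = sorted((abs(x - m), y) for y, m in centroids.items())
        # Margin: distance gap between the two closest centroids
        # (0.0 if the learner has only seen one class).
        margin = dists[1][0] - dists[0][0] if len(dists) > 1 else 0.0
        return dists[0][1], margin

    return predict


def build_grid(data, rows, cols):
    """Deal the instances round-robin onto the grid cells."""
    cells = [(r, c) for r in range(rows) for c in range(cols)]
    return {cell: data[i::len(cells)] for i, cell in enumerate(cells)}


def run_epoch(grid, rows, cols, fit, rng):
    """Train one model per cell, then resample each cell from its neighborhood."""
    models = {cell: fit(local) for cell, local in grid.items()}
    new_grid = {}
    for (r, c), local in grid.items():
        pool = [inst
                for cell in von_neumann_neighbors(r, c, rows, cols)
                for inst in grid[cell]]
        predict = models[(r, c)]
        # Assumed boosting-like rule: low-margin (hard) instances get more weight.
        weights = [1.0 / (1.0 + predict(x)[1]) for x, _ in pool]
        new_grid[(r, c)] = rng.choices(pool, weights=weights, k=len(local))
    return models, new_grid


def run(data, rows=3, cols=3, epochs=3, fit=fit_centroid, seed=0):
    """Run a few epochs and return the per-cell models from the last one."""
    rng = random.Random(seed)
    data = list(data)
    rng.shuffle(data)
    grid = build_grid(data, rows, cols)
    models = {}
    for _ in range(epochs):
        models, grid = run_epoch(grid, rows, cols, fit, rng)
    return models


def ensemble_predict(models, x):
    """Majority vote over the per-cell models."""
    votes = Counter(predict(x)[0] for predict in models.values())
    return votes.most_common(1)[0][0]
```

In this sketch the base learner is only ever called through `fit`, which mirrors the abstract's claim that the ML method itself needs no changes. Each cell's training and resampling step depends only on its own neighborhood, so the per-cell work in `run_epoch` could in principle be distributed across workers.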
