Toward Large-Scale Continuous EDA: A Random Matrix Theory Perspective

Estimation of distribution algorithms (EDAs) are a major branch of evolutionary algorithms (EAs) with some unique advantages in principle. They can exploit correlation structure to drive the search more efficiently, and they can provide insights into the structure of the search space. However, model building in high dimensions is extremely challenging, and as a result existing EDAs become less attractive on large-scale problems because of their large computational requirements. Large-scale continuous global optimisation is key to many modern-day real-world problems, and scaling up EAs to such problems has become one of the biggest challenges of the field. This paper pins down some fundamental roots of the problem and takes a first step toward a new and generic framework that yields effective and efficient EDA-type algorithms for large-scale continuous global optimisation. Our key idea is to apply an ensemble of random projections that map the set of fittest search points to low dimensions, as the basis for a new and generic divide-and-conquer methodology. This approach is rooted in the theory of random projections developed in theoretical computer science, and in developing and analysing our framework we exploit recent results in non-asymptotic random matrix theory.
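To make the ensemble-of-random-projections idea concrete, here is a minimal Python sketch, based only on the description above: each random projection maps the fittest points to a low-dimensional subspace, a simple Gaussian model is fitted and sampled there, and the back-projected samples are averaged. The function name `rp_ensemble_sample`, the Gaussian choice of projection matrices, the per-projection model, and the `d/k` variance rescaling are illustrative assumptions on our part, not the paper's exact algorithm.

```python
import numpy as np

def rp_ensemble_sample(fittest, k, m, rng=None):
    """Sketch of an ensemble-of-random-projections sampling step.

    fittest : (n, d) array of selected (fittest) search points
    k       : target (low) projection dimension, k << d
    m       : ensemble size (number of random projections)

    Illustrative only -- an assumed reading of the abstract,
    not the paper's exact update rule.
    """
    rng = np.random.default_rng(rng)
    n, d = fittest.shape
    mu = fittest.mean(axis=0)
    centred = fittest - mu
    out = np.zeros((n, d))
    for _ in range(m):
        # Gaussian random projection matrix with N(0, 1/d) entries,
        # so that E[R R^T] = I_k (Johnson-Lindenstrauss style scaling).
        R = rng.standard_normal((k, d)) / np.sqrt(d)
        Y = centred @ R.T                      # project to k dimensions
        cov = np.cov(Y, rowvar=False)          # k x k covariance: cheap and well-conditioned when n >= k
        Z = rng.multivariate_normal(np.zeros(k), cov, size=n)
        out += Z @ R                           # back-project samples to d dimensions
    # Averaging the ensemble; the d/k factor undoes the expected k/d
    # shrinkage of project-then-back-project (an assumed correction).
    return mu + (out / m) * (d / k)
```

In this sketch each k-dimensional covariance estimate sidesteps the ill-conditioning of a full d-dimensional sample covariance when n is much smaller than d, which is one plausible way a divide-and-conquer scheme of this kind could keep model building tractable in high dimensions.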
