Genetic algorithms and particle swarm optimization for exploratory projection pursuit

Exploratory Projection Pursuit (EPP) methods were developed thirty years ago in the context of exploratory analysis of large data sets. These methods search for low-dimensional projections that reveal interesting structure present in the data set but not visible in high dimension. Each projection is associated with a real-valued index whose optima correspond to valuable projections. Several EPP indices have been proposed in the statistics literature, but the main problem lies in their optimization. In the present paper, we propose to apply Genetic Algorithms (GA) and the more recent Particle Swarm Optimization (PSO) algorithm to the optimization of several projection pursuit indices. We explain how EPP methods can be implemented so as to become an efficient and powerful tool for the statistician. We illustrate our proposal on several simulated and real data sets.
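To make the idea of optimizing a projection index with PSO concrete, the following is a minimal sketch and not the implementation used in the paper. It assumes a one-dimensional projection, uses the absolute excess kurtosis of the projected data as a stand-in index, and picks the swarm size, inertia and acceleration constants arbitrarily. The names `normalize`, `kurtosis_index` and `pso_projection_pursuit` are hypothetical.

```python
import math
import random


def normalize(v):
    """Rescale a nonzero vector to unit Euclidean length."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v]


def kurtosis_index(data, direction):
    """Illustrative index (an assumption, not one of the paper's indices):
    absolute excess kurtosis of the data projected onto `direction`."""
    z = [sum(x * u for x, u in zip(row, direction)) for row in data]
    n = len(z)
    mean = sum(z) / n
    var = sum((v - mean) ** 2 for v in z) / n
    if var == 0.0:
        return 0.0
    m4 = sum((v - mean) ** 4 for v in z) / n
    return abs(m4 / var ** 2 - 3.0)


def pso_projection_pursuit(data, index, n_particles=20, n_iter=60,
                           inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Maximize `index` over unit direction vectors with a basic PSO.

    The constants are common textbook choices, assumed here for illustration.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    # Each particle is a candidate projection direction on the unit sphere.
    pos = [normalize([rng.gauss(0, 1) for _ in range(dim)])
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [index(data, p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            # Resample in the (unlikely) degenerate case, then project the
            # particle back onto the unit sphere.
            if math.sqrt(sum(a * a for a in pos[i])) < 1e-12:
                pos[i] = [rng.gauss(0, 1) for _ in range(dim)]
            pos[i] = normalize(pos[i])
            val = index(data, pos[i])
            if val > pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val > gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val
```

Calling `pso_projection_pursuit(data, kurtosis_index)` on a list of equal-length numeric rows returns a unit direction and its index value. Because a direction and its negative give the same projection up to sign, the returned vector is only determined up to sign. A GA could be substituted for the inner update loop while keeping the same index function.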
