Adaptive estimated maximum-entropy distribution model

The Estimation of Distribution Algorithm (EDA) is an optimization procedure that learns and samples a conditional probability function. Using a conditional density function permits modelling of multivariate dependencies, which a population-based representation, such as that of classical Genetic Algorithms, does not capture. The Gaussian model is a simple and widely used model for density estimation. However, the assumption of normality is unrealistic for many real-life problems. The maximum-entropy model is an alternative that does not assume a normal distribution. One disadvantage of the maximum-entropy model is the cost of learning its parameters. This paper proposes an Adaptive Estimated Maximum-Entropy Distribution (Adaptive MEED) model, which aims to reduce the complexity of learning the model. Adaptive MEED exploits the fact that samples have a low average fitness in the early stage of the search and gradually converge to an optimum towards the end. Hence, it is not necessary to infer the model from the full set of observed constraints in the early stage of the search. The proposed model estimates the density function with a dynamic set of samples and active constraints. In addition, it includes a global sampling function to compensate for the absence of a mutation operator. The ergodic convergence properties of the proposed model are discussed using Markov chain analysis. A preliminary experimental evaluation shows that the proposed model performs well against genetic algorithms on several clustering problems.
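The select, learn, sample loop that any EDA follows can be illustrated with a minimal sketch. This is not the Adaptive MEED model: it assumes the simple Gaussian baseline mentioned above, with an independent Gaussian fitted per dimension, and a toy `sphere` objective. The function name `gaussian_eda`, all parameter names and defaults, and the uniform `global_frac` share (a rough stand-in for a global sampling step) are hypothetical choices made only for illustration.

```python
import random
import statistics


def sphere(x):
    """Toy objective to minimize: sum of squares, optimum at the origin."""
    return sum(v * v for v in x)


def gaussian_eda(objective, dim=3, pop_size=60, elite_frac=0.3,
                 generations=40, bounds=(-5.0, 5.0), global_frac=0.1, seed=0):
    """Minimal Gaussian EDA sketch (illustrative assumptions throughout)."""
    rng = random.Random(seed)
    lo, hi = bounds
    n_elite = max(2, int(elite_frac * pop_size))
    n_global = int(global_frac * pop_size)

    # Initial population: uniform over the search box.
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_size)]
    best = min(pop, key=objective)

    for _ in range(generations):
        # Selection: keep the fittest samples (lowest objective value).
        elite = sorted(pop, key=objective)[:n_elite]

        # Learning: fit one independent Gaussian per dimension.
        mus = [statistics.fmean([ind[d] for ind in elite]) for d in range(dim)]
        sigmas = [statistics.stdev([ind[d] for ind in elite]) + 1e-9
                  for d in range(dim)]

        # Sampling: draw most of the new population from the learned model...
        pop = [[rng.gauss(mus[d], sigmas[d]) for d in range(dim)]
               for _ in range(pop_size - n_global)]
        # ...and a small share uniformly from the whole box (assumed stand-in
        # for a global sampling step that keeps exploration alive).
        pop += [[rng.uniform(lo, hi) for _ in range(dim)]
                for _ in range(n_global)]

        # Track the best sample seen so far.
        candidate = min(pop, key=objective)
        if objective(candidate) < objective(best):
            best = candidate

    return best


if __name__ == "__main__":
    solution = gaussian_eda(sphere)
    print(solution, sphere(solution))
```

Because the per-dimension Gaussians are independent, this sketch models no dependencies between variables; a maximum-entropy model would replace the learning step with constraint-based density estimation.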
