Algorithms to identify Pareto points in multi-dimensional data sets

This research develops a fast, efficient hybrid algorithm to identify the Pareto frontier in multi-dimensional data sets. The hybrid algorithm blends two base algorithms: the Simple Cull (SC) algorithm, which has low overhead but high overall computational complexity, and the Divide & Conquer (DC) algorithm, which has lower computational complexity but high overhead. The hybrid algorithm employs aspects of each base algorithm and adapts to the properties of the data. Each base algorithm performs better for a different class of data: the SC algorithm performs best for data sets with few nondominated points, high dimensionality, or fewer total points, while the DC algorithm performs better otherwise. The hybrid algorithm executes the following steps in order: (1) execute one pass of the SC algorithm through the data, if merited; (2) execute the DC algorithm, which recursively splits the data into smaller problems; (3) switch to the SC algorithm for problem sizes below a certain limit. Deciding whether Step 1 should be executed, and at what problem size the switch in Step 3 should be made, requires estimates of both algorithms' run times as a function of the size of the data set, the dimension of the data set, and the expected number of nondominated points. These estimates are developed in the thesis. To increase speed, reduce computational and storage complexity, and allow the algorithm to adapt to the data, a canonical transformation of the data to a Lattice Latin Hypercube (LLH) form is developed. The transformation preserves the Pareto property of points while reducing storage space and algorithm run time.
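As a rough illustration of the low-overhead base algorithm, the following Python sketch shows a pairwise-comparison cull in the spirit of SC, together with a per-dimension rank transformation. Several things here are assumptions rather than details taken from the thesis: the minimization convention, the names `dominates`, `simple_cull`, and `rank_transform`, and the reading of the LLH form as a replacement of each coordinate by its integer rank within its dimension. The sketch does not implement the DC algorithm or the run-time estimates that drive the hybrid switch.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b everywhere and strictly better somewhere.

    Assumes every objective is minimized (an assumption of this sketch).
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def simple_cull(points: Sequence[Point]) -> List[Point]:
    """Return the nondominated points by comparing each candidate to a running frontier."""
    frontier: List[Point] = []
    for candidate in points:
        # Discard the candidate if a point already kept dominates it.
        if any(dominates(kept, candidate) for kept in frontier):
            continue
        # Otherwise drop every kept point that the candidate dominates, then keep it.
        frontier = [kept for kept in frontier if not dominates(candidate, kept)]
        frontier.append(candidate)
    return frontier


def rank_transform(points: Sequence[Point]) -> List[Tuple[int, ...]]:
    """Replace each coordinate by its rank within its dimension (ties share a rank).

    The mapping is strictly increasing in every dimension, so dominance
    relations between points are unchanged.
    """
    columns = list(zip(*points))
    rank_maps = [{v: r for r, v in enumerate(sorted(set(col)))} for col in columns]
    return [tuple(rank_maps[d][v] for d, v in enumerate(p)) for p in points]


points = [(1, 5), (2, 2), (3, 3), (5, 1), (4, 4)]
print(simple_cull(points))                  # [(1, 5), (2, 2), (5, 1)]
print(rank_transform(points))               # [(0, 4), (1, 1), (2, 2), (4, 0), (3, 3)]
print(simple_cull(rank_transform(points)))  # [(0, 4), (1, 1), (4, 0)]
```

The same three input points survive the cull before and after the transformation, which is the sense in which a rank-based mapping preserves the Pareto property while storing only small integers.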
To test the three algorithms, three different methods are developed for creating randomized data sets with arbitrary dimensionality and numbers of nondominated points. Each method provides insight into the properties of nondominated sets and supplies test cases for the algorithms. Additionally, a spacecraft design problem is developed to serve as a source of real-world test data.
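The abstract does not describe the three generation methods, so the sketch below is not one of them. It is a hypothetical construction, under a minimization convention, showing that a data set with a prescribed dimension and an exact number of nondominated points can be built. Frontier points are distinct integer vectors with a common coordinate sum, so none can dominate another; every other point is a frontier point shifted by a positive offset in each coordinate, so it is dominated and, having a larger coordinate sum, cannot dominate any frontier point. The names `make_test_set`, `count_nondominated`, `TOTAL`, and `spread` are invented for illustration.

```python
import random
from typing import List, Sequence, Tuple

Point = Tuple[int, ...]

TOTAL = 1_000_000  # hypothetical coordinate sum shared by all frontier points


def make_test_set(n_points: int, n_nondominated: int, dim: int,
                  seed: int = 0, spread: int = 1000) -> List[Point]:
    """Build n_points integer points in dim dimensions with exactly
    n_nondominated nondominated points (minimization assumed)."""
    if dim < 2 or not 1 <= n_nondominated <= min(n_points, TOTAL):
        raise ValueError("need dim >= 2 and 1 <= n_nondominated <= n_points")
    rng = random.Random(seed)

    # Frontier: distinct nonnegative integer points whose coordinates sum to TOTAL.
    frontier = set()
    while len(frontier) < n_nondominated:
        cuts = sorted(rng.randint(0, TOTAL) for _ in range(dim - 1))
        bounds = [0] + cuts + [TOTAL]
        frontier.add(tuple(hi - lo for lo, hi in zip(bounds, bounds[1:])))
    frontier_list = sorted(frontier)

    # Dominated points: a frontier point made strictly worse in every coordinate.
    data: List[Point] = list(frontier_list)
    while len(data) < n_points:
        base = rng.choice(frontier_list)
        data.append(tuple(x + rng.randint(1, spread) for x in base))

    rng.shuffle(data)
    return data


def count_nondominated(points: Sequence[Point]) -> int:
    """Brute-force count of nondominated points, used only to check the generator."""
    def dominates(a: Point, b: Point) -> bool:
        return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

    return sum(1 for p in points if not any(dominates(q, p) for q in points))


data = make_test_set(n_points=200, n_nondominated=15, dim=4, seed=1)
print(len(data), count_nondominated(data))  # 200 15
```

Integer coordinates are used so that the equal-sum argument holds exactly, with no floating-point rounding to undermine it.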
