Paper information - Algorithm selection for sorting and probabilistic inference: a machine learning-based approach

The algorithm selection problem aims to select the best algorithm for a given computational problem instance according to characteristics of the instance. In this dissertation, we first introduce results from a theoretical investigation of the algorithm selection problem. We show, by Rice's theorem, the nonexistence of an automatic algorithm selection program based only on the description of the input instance and the competing algorithms. We also use a framework of instance hardness and algorithm performance based on Kolmogorov complexity to show that algorithm selection for search is also incomputable. Driven by the theoretical results, we propose a machine learning-based inductive approach that uses experimental algorithmic methods and machine learning techniques to solve the algorithm selection problem. Experimentally, we have applied the proposed methodology to algorithm selection for sorting and for the Most Probable Explanation (MPE) problem. In sorting, instances with an existing order are easier for some algorithms. We have studied different presortedness measures, designed algorithms to generate permutations with a specified existing order uniformly at random, and applied various learning algorithms to induce sorting algorithm selection models from runtime experimental results. In the MPE problem, the instance characteristics we have studied include the size and topological type of the network, network connectedness, skewness of the distributions in Conditional Probability Tables (CPTs), and the proportion and distribution of evidence variables. The MPE algorithms considered include an exact algorithm (clique-tree propagation), two stochastic sampling algorithms (MCMC Gibbs sampling and importance forward sampling), two search-based algorithms (multi-restart hill-climbing and tabu search), and one hybrid algorithm combining both sampling and search (ant colony optimization). Another major contribution of this dissertation is the discovery of multifractal properties of the joint probability distributions of Bayesian networks. With sufficient asymmetry in individual prior and conditional probability distributions, the joint distribution is not only highly skewed, but also has clusters of high-probability instantiations at all scales. We present a two-phase hybrid random sampling and search algorithm that exploits this clustering property to solve the MPE problem. Since the MPE problem (decision version) is NP-complete, the multifractal meta-heuristic can be applied to solve other NP-hard combinatorial optimization problems as well.
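The abstract mentions presortedness measures as instance characteristics for sorting but does not name them. As a minimal illustrative sketch, the Python below computes two measures that are standard in the adaptive-sorting literature: the number of inversions and the number of ascending runs. Whether the dissertation uses exactly these measures is an assumption, and the function names `inversions` and `runs` are hypothetical.

```python
def inversions(seq):
    """Count pairs (i, j) with i < j and seq[i] > seq[j], via merge sort."""
    def sort_count(a):
        if len(a) <= 1:
            return a, 0
        mid = len(a) // 2
        left, inv_left = sort_count(a[:mid])
        right, inv_right = sort_count(a[mid:])
        merged, i, j = [], 0, 0
        inv = inv_left + inv_right
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                merged.append(left[i])
                i += 1
            else:
                # right[j] precedes every remaining element of left,
                # so each of those elements forms one inversion with it.
                merged.append(right[j])
                j += 1
                inv += len(left) - i
        merged.extend(left[i:])
        merged.extend(right[j:])
        return merged, inv

    return sort_count(list(seq))[1]


def runs(seq):
    """Count maximal ascending runs (1 for a sorted, non-empty sequence)."""
    if not seq:
        return 0
    # Each descent (a > b between neighbours) starts a new run.
    return 1 + sum(1 for a, b in zip(seq, seq[1:]) if a > b)
```

A sorted input yields 0 inversions and 1 run, while a reversed input of length n yields n(n-1)/2 inversions and n runs. Values like these could serve as features for a learned selection model.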

W. Hsu | Haipeng Guo
