Paper information - Average case analysis of a learning algorithm for µ-DNF expressions

Average case analysis of a learning algorithm for µ-DNF expressions

In this paper, we present an average case model for analyzing learning algorithms. We show how the average behavior of a learning algorithm can be understood in terms of a single hypothesis that we refer to as the average hypothesis. As a case study, we apply the average case model to a simplified version of Pagallo and Haussler's algorithm for PAC learning µ-DNF expressions on the uniform distribution [15]. The average case analysis reveals that, as the training sample size m increases, the average hypothesis evolves from an almost random DNF expression to a well-structured µ-DNF expression that represents exactly the target function. The learning curves exhibit a strong threshold behavior and, in some cases, have a terraced structure: as m increases, the average accuracy stays relatively constant for short or long periods, interspersed with periods in which it rises quickly. This nontrivial behavior cannot be deduced from a simple PAC analysis. The average sample complexity of the algorithm is O(n^2), a large improvement over the PAC analysis result of O(n^6) reported in [15]. The results of the numerical simulations are in very good agreement with the theoretical predictions.

Mostefa Golea

[1] P. Langley, et al. Average-case analysis of a nearest neighbor algorithm, 1993, IJCAI 1993.

[2] M. Opper, et al. On the ability of the optimal perceptron to generalise, 1990.

[3] Leslie G. Valiant, et al. On the learnability of Boolean formulae, 1987, STOC.

[4] Mario Marchand, et al. On Learning Perceptrons with Binary Weights, 1993, Neural Computation.

[5] David Haussler, et al. Learnability and the Vapnik-Chervonenkis dimension, 1989, JACM.

[6] P. J. Green, et al. Probability and Statistical Inference, 1978.

[7] Pat Langley, et al. Induction of One-Level Decision Trees, 1992, ML.

[8] Mario Marchand, et al. Average case analysis of the clipped Hebb rule for nonoverlapping perceptron networks, 1993, COLT '93.

[9] Pat Langley, et al. An Analysis of Bayesian Classifiers, 1992, AAAI.

[10] Daniel S. Hirschberg, et al. Average case analysis of a k-CNF learning algorithm, 1991.