Stochastic Logic Programs

One way to represent a machine learning algorithm's bias over the hypothesis and instance space is as a pair of probability distributions. This approach has been taken both within Bayesian learning schemes and the framework of U-learnability. However, it is not obvious how an Inductive Logic Programming (ILP) system should best be provided with a probability distribution. This paper extends the results of a previous paper by the author which introduced stochastic logic programs as a means of providing a structured definition of such a probability distribution. Stochastic logic programs are a generalisation of stochastic grammars. A stochastic logic program consists of a set of labelled clauses p : C, where p is from the interval [0, 1] and C is a range-restricted definite clause. A stochastic logic program P has a distributional semantics, that is, one which assigns a probability distribution to the atoms of each predicate in the Herbrand base of the clauses in P. These probabilities are assigned to atoms according to an SLD-resolution strategy which employs a stochastic selection rule. It is shown that the probabilities can be computed directly for fail-free logic programs and by normalisation for arbitrary logic programs. The stochastic proof strategy can be used to provide three distinct functions: 1) a method of sampling from the Herbrand base, which can be used to provide selected targets or example sets for ILP experiments; 2) a measure of the information content of examples or hypotheses, which can be used to guide the search in an ILP system; and 3) a simple method for conditioning a given stochastic logic program on samples of data. Functions 1) and 3) are used to measure the generality of hypotheses in the ILP system Progol4.2. This supports an implementation of a Bayesian technique for learning from positive examples only. This paper is an extension of a paper with the same title which appeared in [12].
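
As a concrete illustration of function 1), the following is a minimal Python sketch (not taken from the paper) of sampling by stochastic clause selection from a toy fail-free stochastic logic program that encodes a one-symbol stochastic grammar. The clause labels 0.3 and 0.7, the predicate s/1 and the helper sample_s are illustrative assumptions, not part of the paper.

import random

# Toy stochastic logic program (an assumed example; the labels for s/1 sum to 1):
#   0.3 : s([a|T]) :- s(T).
#   0.7 : s([]).
P_RECURSE = 0.3   # label of the recursive clause
P_STOP = 0.7      # label of the base clause

def sample_s():
    """Sample one ground atom s(String) by stochastic clause selection.

    For a fail-free program, the probability of the sampled atom is the
    product of the labels of the clauses used in its derivation.
    """
    symbols, prob = [], 1.0
    while True:
        if random.random() < P_RECURSE:
            symbols.append("a")       # selected 0.3 : s([a|T]) :- s(T).
            prob *= P_RECURSE
        else:
            prob *= P_STOP            # selected 0.7 : s([]).
            return "s(%s)" % symbols, prob

# Example: sample_s() may return ("s(['a', 'a'])", 0.063), i.e. 0.3 * 0.3 * 0.7.

Taking the negative base-2 logarithm of the returned probability gives the kind of information-content measure referred to in function 2), and repeated sampling of this sort yields the example sets referred to in function 1).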

[1]  H. Jeffreys. Logical Foundations of Probability. Nature, 1952.

[2]  J. Lukasiewicz. Logical foundations of probability theory. 1970.

[3]  Glenn Shafer. A Mathematical Theory of Evidence. Princeton University Press, 1976.

[4]  William F. Clocksin et al. Programming in Prolog. Springer Berlin Heidelberg, 1987.

[5]  Leslie G. Valiant. A theory of the learnable. STOC '84, 1984.

[6]  Nils J. Nilsson. Probabilistic Logic. Artificial Intelligence, 1986.

[7]  Lawrence R. Rabiner. A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, 1989.

[8]  Ronald Fagin et al. Uncertainty, belief, and probability. IJCAI, 1991.

[9]  David Haussler et al. Bounds on the sample complexity of Bayesian learning using information theory and the VC dimension. COLT '91, 1991.

[10]  Michael Kearns et al. Bounds on the sample complexity of Bayesian learning using information theory and the VC dimension. IJCNN International Joint Conference on Neural Networks, 1992.

[11]  Stephen Muggleton. Bayesian inductive logic programming. COLT '94, 1994.

[12]  Luc De Raedt et al. Inductive Logic Programming: Theory and Methods. Journal of Logic Programming, 1994.

[13]  Stephen Muggleton et al. A Learnability Model for Universal Representations and Its Application to Top-down Induction of Decision Trees. Machine Intelligence 15, 1995.

[14]  Taisuke Sato. A Statistical Learning Method for Logic Programs with Distribution Semantics. ICLP, 1995.

[15]  Stephen Muggleton. Learning from Positive Data. Inductive Logic Programming Workshop, 1996.

[16]  Thomas Lukasiewicz. Probabilistic Logic Programming. ECAI, 1998.