One way to represent a machine learning algorithm's bias over the hypothesis and instance space is as a pair of probability distributions. This approach has been taken both within Bayesian learning schemes and within the framework of U-learnability. However, it is not obvious how an Inductive Logic Programming (ILP) system should best be provided with a probability distribution. This paper extends the results of a previous paper by the author which introduced stochastic logic programs as a means of providing a structured definition of such a probability distribution. Stochastic logic programs are a generalisation of stochastic grammars. A stochastic logic program consists of a set of labelled clauses p : C, where p is from the interval [0,1] and C is a range-restricted definite clause. A stochastic logic program P has a distributional semantics, that is, one which assigns a probability distribution to the atoms of each predicate in the Herbrand base of the clauses in P. These probabilities are assigned to atoms according to an SLD-resolution strategy which employs a stochastic selection rule. It is shown that the probabilities can be computed directly for fail-free logic programs and by normalisation for arbitrary logic programs. The stochastic proof strategy can be used to provide three distinct functions: 1) a method of sampling from the Herbrand base, which can be used to provide selected targets or example sets for ILP experiments; 2) a measure of the information content of examples or hypotheses, which can be used to guide the search in an ILP system; and 3) a simple method for conditioning a given stochastic logic program on samples of data. Functions 1) and 3) are used to measure the generality of hypotheses in the ILP system Progol4.2. This supports an implementation of a Bayesian technique for learning from positive examples only. This paper is an extension of a paper with the same title which appeared in [12].
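As a rough illustration (not taken from the paper), the Python sketch below samples from a toy two-clause stochastic logic program defining a hypothetical predicate nat/1; the clause labels of 0.5 and the predicate itself are invented for the example. Because the labels of the clauses for the predicate sum to one and the program is fail-free, a stochastic SLD-derivation never fails, and the probability of a sampled atom is simply the product of the labels of the clauses selected, as described above.

```python
import random

# A minimal sketch, assuming the following toy stochastic logic program
# (not from the paper):
#
#   0.5 : nat(zero).
#   0.5 : nat(s(X)) :- nat(X).
#
# The program is fail-free, so no derivation needs to be discarded and the
# probability of a sampled atom is the product of the selected clause labels.

CLAUSES = [
    (0.5, "base"),       # 0.5 : nat(zero).
    (0.5, "recursive"),  # 0.5 : nat(s(X)) :- nat(X).
]

def sample_nat():
    """Sample one atom nat(T) from the Herbrand base; return (T, derivation probability)."""
    term, prob = "zero", 1.0
    while True:
        # Stochastic selection rule: choose a clause with probability
        # proportional to its label.
        label, kind = random.choices(CLAUSES, weights=[p for p, _ in CLAUSES])[0]
        prob *= label
        if kind == "base":
            return term, prob       # derivation ends at nat(zero)
        term = f"s({term})"         # recursive clause: wrap one more successor

if __name__ == "__main__":
    for _ in range(5):
        atom, p = sample_nat()
        print(f"nat({atom})  derivation probability = {p}")
```

For a program that is not fail-free, the same sampling scheme would have to discard failed derivations, and the resulting atom probabilities would be obtained by normalisation, as noted in the abstract.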
[1] H. Jeffreys. Logical Foundations of Probability. Nature, 1952.
[2] J. Lukasiewicz. Logical foundations of probability theory, 1970.
[3] Glenn Shafer. A Mathematical Theory of Evidence. Princeton University Press, 1976.
[4] William F. Clocksin et al. Programming in Prolog. Springer Berlin Heidelberg, 1987.
[5] Leslie G. Valiant et al. A theory of the learnable. STOC '84, 1984.
[6] Nils J. Nilsson. Probabilistic Logic. Artificial Intelligence, 1986.
[7] Lawrence R. Rabiner et al. A tutorial on hidden Markov models and selected applications in speech recognition. Proc. IEEE, 1989.
[8] Ronald Fagin et al. Uncertainty, belief, and probability. IJCAI, 1991.
[9] David Haussler et al. Bounds on the sample complexity of Bayesian learning using information theory and the VC dimension. COLT '91, 1991.
[10] Michael Kearns et al. Bounds on the sample complexity of Bayesian learning using information theory and the VC dimension. IJCNN International Joint Conference on Neural Networks, 1992.
[11] Stephen Muggleton et al. Bayesian inductive logic programming. COLT '94, 1994.
[12] Luc De Raedt et al. Inductive Logic Programming: Theory and Methods. J. Log. Program., 1994.
[13] Stephen Muggleton et al. A Learnability Model for Universal Representations and Its Application to Top-down Induction of Decision Trees. Machine Intelligence 15, 1995.
[14] Taisuke Sato et al. A Statistical Learning Method for Logic Programs with Distribution Semantics. ICLP, 1995.
[15] Stephen Muggleton et al. Learning from Positive Data. Inductive Logic Programming Workshop, 1996.
[16] Thomas Lukasiewicz et al. Probabilistic Logic Programming. ECAI, 1998.